3 The Causal Hierarchy and Three Worlds
v0.3
3.1 Learning Objectives
After reading this chapter, you will be able to:
- Understand the three modelling layers (Structural, Dynamical, Observable) and what each layer contributes
- Understand the three levels of Reason (Association, Intervention, Counterfactual) in Pearl’s causal hierarchy
- See how the three worlds and three levels complement each other
- Apply the framework to identify what questions can and cannot be answered from data
3.2 Introduction
This book develops a unified framework for causal reasoning about complex dynamical systems. The framework is grounded in three complementary structures:
- Three modelling layers (Structural, Dynamical, Observable): A way to organise assumptions in a data-generating process at different levels of abstraction.
- Three levels of Reason (Association, Intervention, Counterfactual): A way to organise causal questions by what they require from a model (correlational, interventional, and counterfactual modes).
These structures work together: the three layers clarify which assumptions are structural, which are dynamical, and which live at the data/measurement level, while the three levels of Reason clarify which queries are feasible under which assumptions.
3.3 Three Worlds: Structural, Dynamical, Observable
Our framework uses three modelling layers:
- Structural: causal structure and invariances (graphs, mechanisms, identification assumptions).
- Dynamical: time evolution (state transitions, ODEs/SDEs, feedback, attractors).
- Observable: measurement and data (observation models, noise, estimators, diagnostics).
The key idea is that assumptions constrain what can be learned. Structural assumptions constrain the class of admissible dynamical laws, and dynamical assumptions constrain what patterns can plausibly appear in data. This asymmetry is standard in statistical modelling: the data do not uniquely determine a model class without prior structure.
3.3.1 How the Layers Relate
The layers are related in a nested way:
- The Structural layer specifies which variables can affect which others (and which assumptions we treat as invariant).
- The Dynamical layer specifies how those variables evolve through time under those structural constraints.
- The Observable layer specifies how we measure the system, which variables we observe directly, and how noise enters.
If you like a philosophical interpretation, you can read this as a kind of “constraint hierarchy” (more abstract assumptions constrain more concrete predictions). Throughout the book, we will keep the technical content primary and treat any philosophical terminology as optional gloss.
3.3.2 How Structural, Dynamical, and Observable Fit Together
The progression from Structural \(\to\) Dynamical \(\to\) Observable can be read as a modelling stack: structural assumptions constrain the admissible dynamics, and the combination of structure and dynamics constrains the distribution of observables.
This progression is captured mathematically by the observation model:
\[ Y_t = h(X_t, C, U^y_t) \]
The latent state \(X_t\) is part of the structural/dynamical description, and the observation model \(h\) specifies how \(X_t\) generates the observable \(Y_t\) (up to measurement noise).
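The layered generative story can be sketched in code. The following is a minimal illustrative simulation, not a model from the text: the transition function, the linear readout, the coefficients, and the noise scales are all assumptions chosen only to show where each layer lives.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, c, u_x):
    # Dynamical layer (illustrative): mean-reverting latent state with context c
    return 0.9 * x + 0.1 * c + u_x

def h(x, c, u_y):
    # Observable layer (illustrative): noisy linear readout of the latent state
    return 2.0 * x + u_y

c = 1.0      # static context C
x = 0.0      # initial latent state X_0
ys = []
for t in range(100):
    x = f(x, c, rng.normal(scale=0.1))          # X_t evolves under the dynamics
    ys.append(h(x, c, rng.normal(scale=0.5)))   # Y_t = h(X_t, C, U^y_t)
```

The structural layer is implicit in which arguments each function accepts: `h` reads the latent state but does not feed back into `f`.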
3.4 Three Levels of Reason: The Causal Hierarchy
The Pearl Causal Hierarchy (PCH), also known as the Pearl/Bareinboim causal hierarchy (Bareinboim and Pearl 2016), provides a fundamental framework for understanding which questions can be answered from data (Pearl 2009). The hierarchy formalises three levels of Reason, each a distinct mode of reasoning: Seeing (association), Doing (intervention), and Imagining (counterfactual).
3.4.1 The Three Levels of the PCH
The PCH organises causal reasoning into three levels, each corresponding to a different cognitive capability:
3.4.1.1 Level 1: Association (Seeing)
Question: “What will happen if I observe X?”
Capability: Seeing—observing patterns and correlations in data.
- Conditional distributions: \(P(Y_t \mid X_t)\)
- Correlations and associations
- Predictive models
Limitation: Cannot answer “what if I change X?”—seeing correlations doesn’t tell us what happens under intervention.
Layer context: Primarily operates in the Observable layer, the realm of what we can directly observe and measure.
3.4.1.2 Level 2: Intervention (Doing)
Question: “What will happen if I do X?”
Capability: Doing—predicting the effects of actions and interventions.
- Interventional distributions: \(P^{do(X_t = x)}(Y_t)\)
- Requires structural assumptions
- Answers policy questions
Notation: \(do(\cdot)\) operator (Pearl 2009)
Requirement: Access to causal structure (graph or mechanisms) to reason about interventions.
Layer context: Requires a structural model (graph/mechanisms) plus a data-generating story for how interventions modify the mechanism and propagate through the dynamics into observables.
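Mechanism replacement ("graph surgery") can be made concrete with a toy SCM. Everything here is illustrative: the graph \(Z \to X\), \(Z \to Y\), \(X \to Y\) and the coefficients are assumptions invented for the sketch, not part of any model in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Toy SCM: Z -> X, Z -> Y, X -> Y  (Z is a common cause)
z = rng.normal(size=n)
x_obs = z + rng.normal(size=n)                     # observational mechanism for X
y_obs = 2.0 * x_obs + 3.0 * z + rng.normal(size=n)

# do(X = 1): replace X's mechanism with the constant 1 ("graph surgery"),
# keeping every other mechanism and exogenous noise distribution intact
x_do = np.ones(n)
y_do = 2.0 * x_do + 3.0 * z + rng.normal(size=n)

# E[Y | X = 1] (association) vs E[Y | do(X = 1)] (intervention)
mask = np.abs(x_obs - 1.0) < 0.1
print(y_obs[mask].mean())   # ≈ 3.5: biased upward by the confounder Z
print(y_do.mean())          # ≈ 2.0: the true causal effect of setting X = 1
```

Note that the intervention edits only the line defining `x`; the structural invariance of the remaining mechanisms is exactly what makes the computation meaningful.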
3.4.1.3 Level 3: Counterfactual (Imagining)
Question: “What would have happened for this specific unit if X had been different?”
Capability: Imagining—reasoning about alternative realities for specific units.
- Unit-level counterfactuals: \(Y^{do(x)}(\mathbf{u})\)
- Requires shared exogenous noise structure
- Strongest form of causal reasoning
Requirement: Full structural causal model with explicit exogenous noise to reason about the same unit under different interventions.
Layer context: Replays the model for a specific unit (fixed \(\mathbf{u}\)) under an alternative intervention, producing a unit-level alternative outcome.
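The "replay for a fixed unit" step can be sketched in a few lines. The SCM below is hypothetical (a single additive-noise mechanism invented for illustration), and the exogenous values are simply fixed rather than inferred, which in practice would be the abduction step.

```python
# Toy SCM: X := U_x,  Y := 2*X + U_y  (additive exogenous noise)
def scm(u_x, u_y, do_x=None):
    x = u_x if do_x is None else do_x   # intervention replaces X's mechanism
    y = 2.0 * x + u_y
    return x, y

# Factual world for one specific unit (in practice u is inferred by abduction)
u = (1.0, 0.5)
x_f, y_f = scm(*u)              # factual: X = 1.0, Y = 2.5

# Counterfactual world: SAME unit (same u), alternative intervention do(X = 0)
_, y_cf = scm(*u, do_x=0.0)     # Y^{do(0)}(u) = 0.5
```

Holding `u` fixed across the two calls is what makes `y_cf` a unit-level counterfactual rather than a population-level interventional prediction.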
3.4.2 How Levels and Worlds Complement Each Other
The three levels of Reason complement the three modelling layers:
- Three layers organise modelling assumptions: Structural (causal structure/invariances), Dynamical (time evolution), Observable (measurement and data).
- Three levels describe how we reason about causal questions: Association (seeing), Intervention (doing), Counterfactual (imagining).
Mapping:
- Level 1 (Association): works with the observed distribution (what you can estimate from data).
- Level 2 (Intervention): requires a causal model to define and compute interventional distributions.
- Level 3 (Counterfactual): requires unit identity via shared exogenous variables \(\mathbf{u}\) (or a posterior over them) to define unit-level alternative outcomes.
Practically, these three levels trace how much structure you need to assume. Association stays within the observed distribution. Intervention requires a causal model. Counterfactuals require the strongest assumptions because they are defined at the unit level (fixed \(\mathbf{u}\)) and are therefore underdetermined without an explicit structural model (Pearl 2009).
3.5 Why This Framework Matters
Combining the three modelling layers with Pearl’s causal hierarchy provides a compact way to organise modelling choices and the queries they support:
3.5.1 1. Clarifies the Nature of Latent States
Without the three-layer view: Latent states are “unobserved variables” (a purely negative definition). With the three-layer view: latent states are internal variables evolved by the dynamical model and mapped to data by the observation model.
3.5.2 2. Makes Sense of Counterfactuals
Without the three-layer view: Counterfactuals require shared exogenous noise, but this can feel ad hoc. With the three-layer view: shared \(\mathbf{u}\) is exactly what fixes unit identity across worlds, making a counterfactual a unit-level object.
3.5.3 3. Explains Why Interventions Work
Without the three-layer view: Interventions are “parameter changes” or “graph surgery”. With the three-layer view: an intervention defines a modified generating process, whose consequences are propagated through the dynamics into observables.
This also highlights a standard point in statistics: the observed distribution typically underdetermines the structural model. Many different causal structures can fit the same data without additional assumptions (or interventions) (Pearl 2009).
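The underdetermination point has a classic bivariate-Gaussian illustration: the two causal directions below are invented for this sketch, with parameters chosen so that \(X \to Y\) and \(Y \to X\) induce the same joint distribution, so no amount of observational data distinguishes them.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# Model A: X -> Y
x_a = rng.normal(size=n)
y_a = 0.5 * x_a + rng.normal(scale=np.sqrt(0.75), size=n)

# Model B: Y -> X, parameters chosen to match Model A's joint distribution
y_b = rng.normal(size=n)
x_b = 0.5 * y_b + rng.normal(scale=np.sqrt(0.75), size=n)

# Both joints are N(0, [[1, 0.5], [0.5, 1]]): same data, opposite causal arrows
print(np.cov(x_a, y_a))
print(np.cov(x_b, y_b))
```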
3.5.4 4. Unifies Different Types of Reasoning
All three levels are unified by one idea: different queries require different assumptions about the data-generating process.
3.6 Conditional vs Interventional Forecasting
In dynamical systems, a critical distinction arises (Robins 1986; Robins et al. 2000):
- Conditional forecasting: \(P(Y_{t+1} \mid Y_{1:t}, A_t = a)\) — “What happens if I observe treatment \(a\)?” (Level 1: Association)
- Interventional forecasting: \(P^{do(A_t = a)}(Y_{t+1} \mid Y_{1:t})\) — “What happens if I set treatment to \(a\)?” (Level 2: Intervention)
These differ when treatment assignment is confounded. The interventional forecast requires causal structure, while the conditional forecast only uses observed associations.
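The gap between the two forecasts is easy to reproduce in a confounded toy setting. The scenario below is hypothetical: a latent health state drives both treatment assignment and outcome, with numbers chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Latent state confounds treatment assignment and next-step outcome
s = rng.normal(size=n)                       # latent health (lower = sicker)
a = (s < 0).astype(float)                    # policy: treat the sicker units
y_next = s + 1.0 * a + rng.normal(size=n)    # treatment truly helps (+1.0)

# Conditional forecast E[Y_{t+1} | A_t = 1]: mixes the effect with selection
print(y_next[a == 1].mean())   # ≈ 0.2 (treated units started off sicker)

# Interventional forecast E[Y_{t+1} | do(A_t = 1)]: treat everyone
y_do = s + 1.0 + rng.normal(size=n)
print(y_do.mean())             # ≈ 1.0 (the treatment effect, unconfounded)
```

Conditioning on observed treatment makes the treatment look nearly useless, because the latent state that triggered treatment also depresses the outcome; the `do`-forecast removes that selection.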
3.7 Feedback and Partial Observability
Time-evolving systems introduce challenges:
- Feedback: Past outcomes affect future treatments (Robins 1986; Robins et al. 2000)
- Partial observability: Latent states must be inferred (Durbin and Koopman 2012; Särkkä 2013)
- Time-varying confounding: Confounders evolve over time (Robins 1986; Robins et al. 2000; Hernán and Robins 2020)
Reasoning correctly in the presence of these challenges requires explicit causal semantics (Level 2 and Level 3). The three-layer framework helps: we need a structural model, a dynamical model, and an observation model.
3.8 Worked Example: Sheep System
3.8.1 Three Worlds and Three Levels
Consider a simple sheep population model with predator control:
Three Layers Perspective:
- Structural: causal structure and invariances (what affects what, and what is held fixed across contexts)
- Dynamical: time evolution (predator-prey dynamics, attractors, feedback)
- Observable: what is measured (noisy population counts, recorded interventions)
Three Levels Perspective:
- L1 (Seeing/Association): "What is the association between predator removal and sheep population?" Observing patterns in data.
- L2 (Doing/Intervention): "What happens to the sheep population if we remove predators?" Predicting the effect of an action under a causal model.
- L3 (Imagining/Counterfactual): "What would the sheep population have been for this specific ecosystem if predators had been removed earlier?" Unit-level alternative outcome given shared \(\mathbf{u}\).
The L3 query requires unit identity via shared \(\mathbf{u}\) (or an inferred posterior over \(\mathbf{u}\)) plus an explicit structural model.
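The sheep example can be sketched as a simulation in which the factual and counterfactual runs share the same exogenous noise sequence. All rates, noise scales, and initial conditions below are invented for illustration; only the structure (predation coupling, intervention as severing it, noisy counts) comes from the example.

```python
import numpy as np

rng = np.random.default_rng(3)

def step(sheep, pred, remove_pred, u):
    # Structural layer: the intervention severs the predation mechanism
    if remove_pred:
        pred = 0.0
    u_s, u_p = u
    # Dynamical layer: discrete-time predator-prey update (illustrative rates)
    sheep_next = sheep + 0.1 * sheep - 0.02 * sheep * pred + u_s
    pred_next = pred + 0.005 * sheep * pred - 0.1 * pred + u_p
    return max(sheep_next, 0.0), max(pred_next, 0.0)

T = 50
noise = [(rng.normal(scale=0.5), rng.normal(scale=0.1)) for _ in range(T)]

# Factual and counterfactual runs share the same exogenous noise (same u):
# this is what makes the comparison an L3, unit-level query
s_f, p_f = 50.0, 5.0
s_cf, p_cf = 50.0, 5.0
for t in range(T):
    s_f, p_f = step(s_f, p_f, remove_pred=False, u=noise[t])
    s_cf, p_cf = step(s_cf, p_cf, remove_pred=True, u=noise[t])

# Observable layer: we only ever see a noisy count of the latent population
obs_f = s_f + rng.normal(scale=1.0)
```

Rerunning with fresh noise instead of `noise[t]` would answer the L2 question (a population-level intervention), not the L3 question about this specific ecosystem.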
3.9 Key Takeaways
- The three-layer framework (Structural, Dynamical, Observable) organises modelling assumptions
- The three levels of Reason (Association, Intervention, Counterfactual) describe how we reason about causal questions
- Layers and levels complement each other: layers organise modelling assumptions, levels organise causal queries
- Structural assumptions constrain dynamics, which constrain observables
- In dynamical systems, conditional and interventional forecasts differ
- Feedback and partial observability require explicit causal modelling (Level 2 and Level 3)
- Counterfactuals are the strongest form of causal reasoning but require the strongest assumptions