10 Counterfactuals: Unit-Level Alternatives at the Structural Level
Status: Draft
v0.4
10.1 Learning Objectives
After reading this chapter, you will be able to:
Formalise counterfactual queries using shared exogenous variables at the Structural level
Understand why counterfactuals are stronger objects than interventional averages
Recognise what must be assumed (and what cannot be recovered) for structural counterfactuals
Use graph structure to determine what's needed for counterfactual reasoning
10.2 Introduction
Counterfactuals ask: “What would have happened for this specific unit under an alternative?” (Pearl 2009; Imbens and Rubin 2015). At the Structural level, counterfactuals are defined by keeping the same unit’s exogenous variables \(\mathbf{u}\) fixed while changing the intervention and replaying the same structural mechanisms.
10.3 Counterfactuals as Unit-Level Replays
Counterfactuals compare outcomes under different interventions for the same unit, holding fixed what is not modelled (the exogenous realisation). When we ask “What would have happened for this specific unit if conditions had been different?”, we are asking for a unit-level comparison under the same latent/exogenous conditions.
In our three-layer framework (Structural, Dynamical, Observable), counterfactuals at the Structural level involve:
For a specific unit: Fixed \(\mathbf{u}\) (exogenous realisation)
Under alternative conditions: Different interventions applied to the same mechanisms
The counterfactual \(Y^{do(A = a)}(\mathbf{u})\) represents the outcome for the same unit (same \(\mathbf{u}\)) under a different intervention. Computing it requires three ingredients:
A structural model: mechanisms \(F\) and a graph \(G\)
A unit identity: a fixed \(\mathbf{u}\) (or a posterior over \(\mathbf{u}\) given evidence)
A well-defined intervention: a modified assignment under \(do(\cdot)\)
This is fundamentally different from:
What actually happened: The Observable layer (what we observed)
What will happen on average: Population-level interventions
Counterfactuals are the strongest form of causal reasoning because they require understanding both:
The Observable: What actually happened
The Structural: What could have happened (alternative structural configurations)
10.3.2 Shared Exogenous Noise as Unit Identity
The requirement for shared exogenous noise \(\mathbf{u}\) is what makes a counterfactual a unit-level object. It lets us ask: “For this specific unit (same \(\mathbf{u}\)), what would have happened under an alternative intervention?”
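To make the replay idea concrete, here is a minimal sketch in plain Python (the chapter's worked examples use Julia). The linear mechanism `structural_Y` and its coefficients are purely illustrative, not from the text:

```python
# Toy SCM (illustrative): Y := 2*A + U, with exogenous U fixed per unit.
# A counterfactual replays the SAME exogenous realisation u under a different do(A=a).

def structural_Y(a, u):
    """Mechanism f_Y evaluated under do(A=a) with exogenous realisation u."""
    return 2.0 * a + u

u = 0.5                                 # this unit's fixed exogenous realisation
y_factual = structural_Y(1, u)          # world with do(A=1)
y_counterfactual = structural_Y(0, u)   # same u, alternative world do(A=0)

# The unit-level contrast is well defined because u is shared across worlds
print(y_factual - y_counterfactual)  # 2.0
```

Both worlds run the same mechanism; only the intervention differs, which is exactly what fixes the unit's identity.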
10.4 Counterfactuals vs Interventions
Intervention (Level 2): Average over all units (Pearl 2009): \[
\mathbb{E}[Y^{do(A=1)}] = \int Y^{do(A=1)}(\mathbf{u}) P(\mathbf{u}) \, d\mathbf{u}
\]
Counterfactual (Level 3): For a specific unit (fixed \(\mathbf{u}\)) (Imbens and Rubin 2015; Richardson and Robins 2013): \[
Y^{do(A=1)}(\mathbf{u}) \quad \text{vs} \quad Y^{do(A=0)}(\mathbf{u})
\]
Counterfactuals require unit-level reasoning, not just population averages.
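The distinction can be seen in a short Python sketch (the chapter's examples use Julia). The mechanism below is illustrative and deliberately has a heterogeneous effect, so the unit-level contrast differs from the population average:

```python
import statistics

def y_do(a, u):
    # Illustrative mechanism with effect (2 + u) that varies across units
    return (2.0 + u) * a + u

# Population: exogenous draws u (a fixed grid here, standing in for P(u))
us = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Level 2: interventional contrast averaged over P(u)
ate = statistics.mean(y_do(1, u) - y_do(0, u) for u in us)

# Level 3: counterfactual contrast for one specific unit (fixed u = 0.5)
unit_effect = y_do(1, 0.5) - y_do(0, 0.5)

print(ate, unit_effect)  # 2.0 2.5
```

The average effect is 2.0, but the unit with \(u = 0.5\) has effect 2.5: the population average cannot answer the unit-level question.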
10.5 Shared Exogenous Noise
The key to counterfactuals is shared exogenous noise:
Same unit \(\Leftrightarrow\) same realisation of \(\mathbf{U}\)
Different worlds \(\Leftrightarrow\) different interventions
Counterfactual: Same \(\mathbf{u}\), different \(do(\cdot)\)
10.5.1 Implementation: Graph Structure and Exogenous Noise
The causal graph structure determines which exogenous variables must be shared for counterfactual reasoning:
```julia
# Find project root and include ensure_packages.jl
project_root = let current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie CausalDynamics Graphs

# Example: Counterfactual requires shared U
# Graph: U → X, U → Y, X → Y
# Nodes: 1=U, 2=X, 3=Y
g = DiGraph(3)
add_edge!(g, 1, 2)  # U → X
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # X → Y

# To compute Y^{do(X=0)}(u) given Y^{do(X=1)}(u) = y_obs,
# we need the same u for both worlds.
# The graph shows U affects both X and Y,
# so U must be shared across counterfactual worlds.

# Check which variables affect Y (these determine which U's must be shared)
parents_Y = get_parents(g, 3)
println("Parents of Y: ", parents_Y)  # Set([1, 2]) = {U, X}

# For counterfactual Y^{do(X=x)}(u), we need:
# - Same U (exogenous noise affecting Y)
# - Different X (intervention changes X)

# The ancestors of Y tell us all variables that could affect Y:
ancestors_Y = get_ancestors(g, 3)
println("Ancestors of Y: ", ancestors_Y)  # Set([1, 2]) = {U, X}

# This shows: for counterfactual reasoning about Y,
# we need to fix U (shared exogenous noise) and vary X (intervention)

# Visualise graph
let
    # Highlight U (shared exogenous) in yellow, treatment and outcome in lightblue
    node_colors = [:yellow, :lightblue, :lightblue]
    fig, ax, p = dagplot(g;
        figure_size = (600, 400),
        layout_mode = :acyclic,
        node_color = node_colors,
        nlabels = ["U (exogenous)", "X (treatment)", "Y (outcome)"]
    )
    fig  # Only this gets displayed
end
```
Parents of Y: Set([2, 1])
Ancestors of Y: Set([2, 1])
Counterfactual at Structural level: shared exogenous noise U enables alternative structural configurations
10.5.2 Implementation: Graph Structure for Counterfactual Reasoning
The causal graph structure determines what information is needed for counterfactual reasoning. We can use CausalDynamics.jl to identify necessary variables:
```julia
# Find project root and include ensure_packages.jl
project_root = let current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example: Treatment counterfactual
# Graph: U → A, U → Y, A → Y
# Nodes: 1=U, 2=A, 3=Y
g = DiGraph(3)
add_edge!(g, 1, 2)  # U → A
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # A → Y

# To compute counterfactual Y^{do(A=0)}(u) for a unit with observed Y^{do(A=1)}(u) = y_obs:
# 1. We need to infer u from observations
# 2. The Markov boundary of Y tells us what variables are needed
mb_Y = markov_boundary(g, 3)  # Outcome Y
println("Markov boundary of Y: ", mb_Y)  # Set([1, 2]) = {U, A}

# This tells us: to reason about Y counterfactually, we need U and A.
# If U is unobserved, counterfactuals are not fully identified.

# Check if A → Y is identifiable (necessary for counterfactuals)
adj_set = backdoor_adjustment_set(g, 2, 3)
println("Adjustment set for A → Y: ", adj_set)  # Set([1]) = {U}

# If U is unobserved, we cannot identify the causal effect,
# and therefore cannot compute counterfactuals
```
Markov boundary of Y: Set([2, 1])
Adjustment set for A β Y: Set([1])
10.6 Formal Definition
For a fixed exogenous realisation \(\mathbf{u}\), the counterfactual outcome under intervention \(\iota\) is: \[
Y^{\iota}(\mathbf{u})
\]
To compute this, we:
1. Infer the exogenous noise \(\mathbf{u}\) from observed data
2. Simulate the counterfactual world with the same \(\mathbf{u}\) but a different intervention
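These two steps correspond to Pearl's abduction–action–prediction recipe. A minimal Python sketch with an invertible illustrative mechanism (the chapter's examples use Julia; in general abduction yields a posterior over \(\mathbf{u}\), not an exact point value):

```python
# Abduction-action-prediction for the toy mechanism Y := 2*A + U_Y (illustrative)

def f_Y(a, u_y):
    return 2.0 * a + u_y

# Step 1 (abduction): infer the exogenous noise from the observed world.
# Here the mechanism is invertible, so u_y is recovered exactly.
a_obs, y_obs = 1.0, 2.5
u_y = y_obs - 2.0 * a_obs  # u_y = 0.5

# Step 2 (action): replace the assignment for A with do(A = 0)
a_cf = 0.0

# Step 3 (prediction): replay the SAME mechanism with the SAME u_y
y_cf = f_Y(a_cf, u_y)
print(y_cf)  # 0.5
```

The counterfactual is well defined only because step 3 reuses the noise inferred in step 1.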
10.7 What Can and Cannot Be Recovered
10.7.1 What We Need
Structural assignments: The mechanisms \(f_i\)
Exogenous noise structure: Distribution \(P(\mathbf{U})\)
Observational data: To infer \(\mathbf{u}\) for specific units
Causal graph structure: To determine what variables are needed for counterfactual inference
10.7.2 What We Cannot Recover
Unobserved confounders: If \(\mathbf{U}\) is not fully observed, counterfactuals may be only partially identified
Non-identifiable mechanisms: If structure is unknown, counterfactuals are not identified
10.8 Practical Guidance for Structural Counterfactuals
In structural systems, counterfactuals require:
Explicit noise models: Make all randomness explicit
Identifiable structure: Causal graph must be known or learnable
Sufficient data: To infer unit-specific noise realisations
Sensitivity analysis: Test robustness to assumptions
We can use graph structure to check whether counterfactuals are identifiable:
```julia
# Find project root and include ensure_packages.jl
project_root = let current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example: Can we compute counterfactual Y^{do(A=0)}(u) given Y^{do(A=1)}(u) = y_obs?
# Graph: U → A, U → Y, A → Y, L → A, L → Y (L is an observed confounder)
# Nodes: 1=U, 2=A, 3=Y, 4=L
g = DiGraph(4)
add_edge!(g, 1, 2)  # U → A
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # A → Y
add_edge!(g, 4, 2)  # L → A
add_edge!(g, 4, 3)  # L → Y

# Check if A → Y is identifiable (necessary condition for counterfactuals)
is_identifiable = is_backdoor_adjustable(g, 2, 3)
println("A → Y is identifiable: ", is_identifiable)  # true

adj_set = backdoor_adjustment_set(g, 2, 3)
println("Adjustment set: ", adj_set)  # Set([1, 4]) = {U, L}

# Problem: if U is unobserved, we cannot adjust for it,
# so counterfactuals are not fully identified.
# However, if we can infer U from observations (e.g., via state-space inference),
# then counterfactuals become possible.

# The Markov boundary tells us what we need to observe:
mb_Y = markov_boundary(g, 3)
println("Variables needed for Y: ", mb_Y)  # Set([1, 2, 4]) = {U, A, L}

# If U is unobserved, we need to infer it from other variables.
# This requires additional assumptions about the noise structure.
```
A → Y is identifiable: true
Adjustment set: Set([4, 1])
Variables needed for Y: Set([4, 2, 1])
10.9 Bounds and Partial Identification
When full counterfactuals are not identified, we can still obtain bounds:
Non-parametric bounds: Range of possible counterfactual values
Sensitivity parameters: How results change with assumptions about unobserved confounders
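For a binary outcome, worst-case (Manski-style) non-parametric bounds can be computed directly from observed frequencies. A Python sketch with illustrative numbers (not from the chapter):

```python
# Worst-case (Manski) bounds on E[Y^{do(A=1)}] for binary Y when the
# confounder is unobserved. Input probabilities are illustrative.
p_a1 = 0.5            # observed P(A = 1)
p_y1_given_a1 = 0.75  # observed P(Y = 1 | A = 1)

# Decompose: E[Y^{do(A=1)}] = P(Y=1 | A=1) P(A=1) + E[Y^{do(A=1)} | A=0] P(A=0)
# The second term's conditional mean is unobservable but lies in [0, 1],
# which yields sharp worst-case bounds:
lower = p_y1_given_a1 * p_a1 + 0.0 * (1 - p_a1)
upper = p_y1_given_a1 * p_a1 + 1.0 * (1 - p_a1)

print(lower, upper)  # 0.375 0.875
```

The width of the interval equals \(P(A=0)\): the less often the intervention arm is observed, the less the data constrain the counterfactual.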
10.9.1 Implementation: Partial Identification
Graph structure can help identify when counterfactuals are only partially identified:
```julia
# Find project root and include ensure_packages.jl
project_root = let current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example: Counterfactual with an unobserved confounder
# Graph: U → A, U → Y, A → Y (U unobserved)
# Nodes: 1=U, 2=A, 3=Y
g = DiGraph(3)
add_edge!(g, 1, 2)  # U → A
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # A → Y

# Check identifiability
is_identifiable = is_backdoor_adjustable(g, 2, 3)
println("A → Y is identifiable: ", is_identifiable)  # true

adj_set = backdoor_adjustment_set(g, 2, 3)
println("Required adjustment: ", adj_set)  # Set([1]) = {U}

# Problem: U is unobserved, so we cannot adjust for it.
# Result: counterfactuals are not fully identified.

# However, we can still reason about bounds:
# - The graph structure shows U is a confounder
# - We can use sensitivity analysis to bound counterfactual values
# - The d-separation structure shows what independence assumptions hold

# Check d-separation: A and Y are not d-separated (confounded)
println("A ⫫ Y (no adjustment): ", CausalDynamics.d_separated(g, 2, 3, []))  # false

# Conditioning on U blocks the backdoor path U → A, U → Y,
# but A and Y remain dependent through the direct edge A → Y:
println("A ⫫ Y | U: ", CausalDynamics.d_separated(g, 2, 3, [1]))  # false

# This structure tells us:
# - Full counterfactuals require U (unobserved)
# - Partial identification is possible via sensitivity analysis
# - Bounds depend on assumptions about U's distribution
```
A → Y is identifiable: true
Required adjustment: Set([1])
A ⫫ Y (no adjustment): false
A ⫫ Y | U: false
Modelling note
Structural counterfactuals ask how the outcome for a specific unit would change under an alternative intervention, holding fixed the unit’s exogenous realisation \(U\). Limits on counterfactual identification revealed by the graph, especially when key confounders are unobserved, can be read as limits on what can be recovered from data without stronger assumptions.
10.10 World Context
This chapter addresses Imagining in the Structural layer: what unit-level alternative outcomes are implied by a structural model? Counterfactuals at the Structural level compare alternative interventions for the same unit (fixed \(\mathbf{u}\)). This is distinct from:
Counterfactuals at the Dynamical level (Chapter 16): Alternative dynamic trajectories
Counterfactuals at the Observable level (Chapter 25): Alternative observable outcomes
10.11 Key Takeaways
Counterfactuals require unit-level reasoning with shared exogenous noise
They are stronger than interventional averages but require stronger assumptions
In structural systems, explicit noise models and identifiable structure are essential
Bounds and sensitivity analysis are valuable when full identification is impossible
Counterfactuals at Structural level explore alternative structural configurations
Imbens, Guido W., and Donald B. Rubin. 2015. Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press.
Pearl, Judea. 2009. Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge University Press.
Richardson, Thomas S., and James M. Robins. 2013. “Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality.” Center for Statistics and the Social Sciences, University of Washington Working Paper Series, no. 128.