10  Counterfactuals: Unit-Level Alternatives at Structural Level

Status: Draft

v0.4

10.1 Learning Objectives

After reading this chapter, you will be able to:

  • Formalise counterfactual queries using shared exogenous variables at the Structural level
  • Understand why counterfactuals are stronger objects than interventional averages
  • Recognise what must be assumed (and what cannot be recovered) for structural counterfactuals
  • Use graph structure to determine what’s needed for counterfactual reasoning

10.2 Introduction

Counterfactuals ask: “What would have happened for this specific unit under an alternative?” (Pearl 2009; Imbens and Rubin 2015). At the Structural level, counterfactuals are defined by holding the same unit’s exogenous variables \(\mathbf{u}\) fixed while changing the intervention and replaying the same structural mechanisms.

10.3 Counterfactuals as Unit-Level Replays

Counterfactuals compare outcomes under different interventions for the same unit, holding fixed what is not modelled (the exogenous realisation). When we ask “What would have happened for this specific unit if conditions had been different?”, we are asking for a unit-level comparison under the same latent/exogenous conditions.

In our three-layer framework (Structural, Dynamical, Observable), counterfactuals at the Structural level involve:

  • For a specific unit: Fixed \(\mathbf{u}\) (exogenous realisation)
  • Under alternative conditions: Different interventions applied to the same mechanisms

The counterfactual \(Y^{do(A = a)}(\mathbf{u})\) represents the outcome for the same unit (same \(\mathbf{u}\)) under a different intervention.
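To make the replay concrete, here is a minimal sketch in plain Julia, assuming a toy linear mechanism invented for illustration (it is not part of CausalDynamics.jl): the unit is identified with one draw of the exogenous variables, and the two worlds differ only in how \(A\) is set.

using Random
Random.seed!(1)

# Hypothetical structural assignments (illustrative only):
#   A := 1 if U + ε_A > 0, else 0
#   Y := 2A + U + ε_Y
f_A(u, ε_A) = u + ε_A > 0 ? 1.0 : 0.0
f_Y(a, u, ε_Y) = 2a + u + ε_Y

# One unit = one exogenous realisation u = (U, ε_A, ε_Y)
U, ε_A, ε_Y = randn(), randn(), randn()

# Factual world: mechanisms run unmodified
a_fact = f_A(U, ε_A)
y_fact = f_Y(a_fact, U, ε_Y)

# Counterfactual worlds: same (U, ε_Y), different do(·)
y_do1 = f_Y(1.0, U, ε_Y)  # Y^{do(A=1)}(u)
y_do0 = f_Y(0.0, U, ε_Y)  # Y^{do(A=0)}(u)

println("Factual: A = $a_fact, Y = $y_fact")
println("Unit-level contrast: ", y_do1 - y_do0)  # exactly 2 in this linear toy model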

10.3.1 Why Counterfactuals Require Exogenous Variables

Counterfactuals require:

  • A structural model: mechanisms \(F\) and a graph \(G\)
  • A unit identity: fixed \(\mathbf{u}\) (or a posterior over \(\mathbf{u}\) given evidence)
  • A well-defined intervention: a modified assignment under \(do(\cdot)\)

This is fundamentally different from:

  • What actually happened: The Observable layer (what we observed)
  • What will happen on average: Population-level interventions

Counterfactuals are the strongest form of causal reasoning because they require understanding both:

  • The Observable: What actually happened
  • The Structural: What could have happened (alternative structural configurations)

10.3.2 Shared Exogenous Noise as Unit Identity

The requirement for shared exogenous noise \(\mathbf{u}\) is what makes a counterfactual a unit-level object. It lets us ask: “For this specific unit (same \(\mathbf{u}\)), what would have happened under an alternative intervention?”

10.4 Counterfactuals vs Interventions

Intervention (Level 2): Average over all units (Pearl 2009) \[ \mathbb{E}[Y^{do(A=1)}] = \int Y^{do(A=1)}(\mathbf{u}) P(\mathbf{u}) \, d\mathbf{u} \]

Counterfactual (Level 3): For a specific unit (fixed \(\mathbf{u}\)) (Imbens and Rubin 2015; Richardson and Robins 2013) \[ Y^{do(A=1)}(\mathbf{u}) \quad \text{vs} \quad Y^{do(A=0)}(\mathbf{u}) \]

Counterfactuals require unit-level reasoning, not just population averages.
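The contrast can be made concrete with the same toy mechanism as above (again a sketch with an assumed mechanism, not a library API): the interventional quantity averages over draws of \(\mathbf{u}\), while the counterfactual fixes a single draw.

using Random, Statistics
Random.seed!(2)

f_Y(a, u, ε) = 2a + u + ε  # hypothetical outcome mechanism

# Level 2 (intervention): average over the population of exogenous draws
n = 100_000
E_do1 = mean(f_Y(1.0, randn(), randn()) for _ in 1:n)
E_do0 = mean(f_Y(0.0, randn(), randn()) for _ in 1:n)
println("Interventional contrast ≈ ", round(E_do1 - E_do0; digits = 2))  # ≈ 2

# Level 3 (counterfactual): one unit, one (u, ε), two worlds
u, ε = randn(), randn()
println("Unit-level contrast: ", f_Y(1.0, u, ε) - f_Y(0.0, u, ε))  # exactly 2 here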

10.5 Shared Exogenous Noise

The key to counterfactuals is shared exogenous noise:

  • Same unit \(\Leftrightarrow\) same realisation of \(\mathbf{U}\)
  • Different worlds \(\Leftrightarrow\) different interventions
  • Counterfactual: Same \(\mathbf{u}\), different \(do(\cdot)\)

10.5.1 Implementation: Graph Structure and Exogenous Noise

The causal graph structure determines which exogenous variables must be shared for counterfactual reasoning:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie CausalDynamics Graphs

# Example: Counterfactual requires shared U
# Graph: U → X, U → Y, X → Y
# Nodes: 1=U, 2=X, 3=Y
g = DiGraph(3)
add_edge!(g, 1, 2)  # U → X
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # X → Y

# To compute Y^{do(X=0)}(u) given Y^{do(X=1)}(u) = y_obs:
# We need the same u for both worlds

# The graph shows U affects both X and Y
# This means U must be shared across counterfactual worlds

# Check what variables affect Y (these determine which U's must be shared)
parents_Y = get_parents(g, 3)
println("Parents of Y: ", parents_Y)  # Set([1, 2]) = {U, X}

# For counterfactual Y^{do(X=x)}(u), we need:
# - Same U (exogenous noise affecting Y)
# - Different X (intervention changes X)

# The ancestors of Y tell us all variables that could affect Y:
ancestors_Y = get_ancestors(g, 3)
println("Ancestors of Y: ", ancestors_Y)  # Set([1, 2]) = {U, X}

# This shows: for counterfactual reasoning about Y,
# we need to fix U (shared exogenous noise) and vary X (intervention)

# Visualise graph
let
    # Highlight U (shared exogenous) in yellow, treatment and outcome in lightblue
    node_colors = [:yellow, :lightblue, :lightblue]

    fig, ax, p = dagplot(g;
        figure_size = (600, 400),
        layout_mode = :acyclic,
        node_color = node_colors,
        nlabels = ["U (exogenous)", "X (treatment)", "Y (outcome)"]
    )
    fig  # Only this gets displayed
end
Parents of Y: Set([2, 1])
Ancestors of Y: Set([2, 1])

Counterfactual at Structural level: shared exogenous noise U enables alternative structural configurations

10.5.2 Implementation: Graph Structure for Counterfactual Reasoning

The causal graph structure determines what information is needed for counterfactual reasoning. We can use CausalDynamics.jl to identify necessary variables:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example: Treatment counterfactual
# Graph: U → A, U → Y, A → Y
# Nodes: 1=U, 2=A, 3=Y
g = DiGraph(3)
add_edge!(g, 1, 2)  # U → A
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # A → Y

# To compute the counterfactual Y^{do(A=0)}(u) for a unit with observed Y^{do(A=1)}(u) = y_obs:
# 1. We need to infer u from observations
# 2. The Markov boundary of Y tells us what variables are needed

mb_Y = markov_boundary(g, 3)  # Outcome Y
println("Markov boundary of Y: ", mb_Y)  # Set([1, 2]) = {U, A}

# This tells us: to reason about Y counterfactually, we need U and A
# If U is unobserved, counterfactuals are not fully identified

# Check if A → Y is identifiable (necessary for counterfactuals)
adj_set = backdoor_adjustment_set(g, 2, 3)
println("Adjustment set for A → Y: ", adj_set)  # Set([1]) = {U}

# If U is unobserved, we cannot identify the causal effect,
# and therefore cannot compute counterfactuals
Markov boundary of Y: Set([2, 1])
Adjustment set for A → Y: Set([1])

10.6 Formal Definition

For a fixed exogenous realisation \(\mathbf{u}\), the counterfactual outcome under intervention \(\iota\) is: \[ Y^{\iota}(\mathbf{u}) \]

To compute this, we:

  1. Infer the exogenous noise \(\mathbf{u}\) from observed data
  2. Simulate the counterfactual world with the same \(\mathbf{u}\) but a different intervention
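For additive-noise assignments the first step is exact, because the assignment can be inverted for the noise. A minimal sketch with a hypothetical mechanism and made-up observations:

# Hypothetical additive mechanism: Y := 2A + U_Y, so U_Y = Y - 2A is recoverable
f_Y(a, u_y) = 2a + u_y

# Observed: this unit received A = 1 and we saw Y = 3.4
a_obs, y_obs = 1.0, 3.4

# Step 1 (abduction): infer the exogenous noise from the observation
u_y = y_obs - 2a_obs  # 1.4

# Step 2 (action + prediction): replay the mechanism under do(A = 0), same u_y
y_cf = f_Y(0.0, u_y)
println("Y^{do(A=0)}(u) = ", y_cf)  # 1.4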

10.7 What Can and Cannot Be Recovered

10.7.1 What We Need

  • Structural assignments: The mechanisms \(f_i\)
  • Exogenous noise structure: Distribution \(P(\mathbf{U})\)
  • Observational data: To infer \(\mathbf{u}\) for specific units
  • Causal graph structure: To determine what variables are needed for counterfactual inference

10.7.2 What We Cannot Recover

  • Unobserved confounders: If \(\mathbf{U}\) is not fully observed, counterfactuals may be only partially identified
  • Non-identifiable mechanisms: If structure is unknown, counterfactuals are not identified

10.8 Practical Guidance for Structural Counterfactuals

In structural systems, counterfactuals require:

  1. Explicit noise models: Make all randomness explicit
  2. Identifiable structure: Causal graph must be known or learnable
  3. Sufficient data: To infer unit-specific noise realisations
  4. Sensitivity analysis: Test robustness to assumptions (a minimal sketch follows this list)
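The last item can be made concrete with a one-parameter sweep (a sketch; the naive contrast and the confounding bias \(\delta\) are assumed numbers, not estimates from data):

# Naive observational contrast E[Y | A=1] - E[Y | A=0] (hypothetical value)
naive = 2.6

# Assumed confounding bias δ = E[U | A=1] - E[U | A=0], swept over a range;
# under additive confounding the implied causal effect is naive - δ
for δ in -1.0:0.5:1.0
    println("assumed bias δ = ", δ, " → implied effect ≈ ", naive - δ)
end

If the implied effect keeps its sign across the plausible range of \(\delta\), the qualitative conclusion is robust; if not, the counterfactual claim rests on the unverifiable confounding assumption.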

10.8.1 Implementation: Checking Counterfactual Identifiability

We can use graph structure to check whether counterfactuals are identifiable:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example: Can we compute counterfactual Y^{do(A=0)}(u) given Y^{do(A=1)}(u) = y_obs?
# Graph: U → A, U → Y, A → Y, L → A, L → Y (L is observed confounder)
# Nodes: 1=U, 2=A, 3=Y, 4=L
g = DiGraph(4)
add_edge!(g, 1, 2)  # U → A
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # A → Y
add_edge!(g, 4, 2)  # L → A
add_edge!(g, 4, 3)  # L → Y

# Check if A → Y is identifiable (necessary condition for counterfactuals)
is_identifiable = is_backdoor_adjustable(g, 2, 3)
println("A → Y is identifiable: ", is_identifiable)  # true

adj_set = backdoor_adjustment_set(g, 2, 3)
println("Adjustment set: ", adj_set)  # Set([1, 4]) = {U, L}

# Problem: If U is unobserved, we cannot adjust for it
# This means counterfactuals are not fully identified

# However, if we can infer U from observations (e.g., via state-space inference),
# then counterfactuals become possible

# The Markov boundary tells us what we need to observe:
mb_Y = markov_boundary(g, 3)
println("Variables needed for Y: ", mb_Y)  # Set([1, 2, 4]) = {U, A, L}

# If U is unobserved, we need to infer it from other variables
# This requires additional assumptions about the noise structure
A → Y is identifiable: true
Adjustment set: Set([4, 1])
Variables needed for Y: Set([4, 2, 1])

10.9 Bounds and Partial Identification

When full counterfactuals are not identified, we can still obtain bounds:

  • Non-parametric bounds: Range of possible counterfactual values (see the sketch after this list)
  • Sensitivity parameters: How results change with assumptions about unobserved confounders
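For binary \(A\) and \(Y\), the non-parametric bounds have a closed form: \(\mathbb{E}[Y^{do(A=1)}]\) splits into an observed term and an unobservable term that can only be bracketed by 0 and 1. A sketch with hypothetical numbers:

# Observed quantities (hypothetical values)
p_y1_a1 = 0.30  # P(Y = 1, A = 1)
p_a0    = 0.45  # P(A = 0)

# E[Y^{do(A=1)}] = P(Y=1, A=1) + E[Y^{do(A=1)} | A=0] · P(A=0),
# and the unobserved second factor lies in [0, 1]:
lower = p_y1_a1          # if untreated units would all have Y = 0 under treatment
upper = p_y1_a1 + p_a0   # if untreated units would all have Y = 1 under treatment
println("E[Y^{do(A=1)}] ∈ [", lower, ", ", upper, "]")  # [0.3, 0.75]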

10.9.1 Implementation: Partial Identification

Graph structure can help identify when counterfactuals are only partially identified:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example: Counterfactual with unobserved confounder
# Graph: U → A, U → Y, A → Y (U unobserved)
# Nodes: 1=U, 2=A, 3=Y
g = DiGraph(3)
add_edge!(g, 1, 2)  # U → A
add_edge!(g, 1, 3)  # U → Y
add_edge!(g, 2, 3)  # A → Y

# Check identifiability
is_identifiable = is_backdoor_adjustable(g, 2, 3)
println("A → Y is identifiable: ", is_identifiable)  # true

adj_set = backdoor_adjustment_set(g, 2, 3)
println("Required adjustment: ", adj_set)  # Set([1]) = {U}

# Problem: U is unobserved, so we cannot adjust for it
# Result: Counterfactuals are not fully identified

# However, we can still reason about bounds:
# - The graph structure shows U is a confounder
# - We can use sensitivity analysis to bound counterfactual values
# - The d-separation structure shows what independence assumptions hold

# Check d-separation: A and Y are not d-separated (confounded and directly connected)
println("A ⫫ Y (no adjustment): ", CausalDynamics.d_separated(g, 2, 3, []))  # false

# Conditioning on U blocks the backdoor path A ← U → Y, but the direct
# edge A → Y stays open, so A and Y are still not d-separated
println("A ⫫ Y | U: ", CausalDynamics.d_separated(g, 2, 3, [1]))  # false

# This structure tells us:
# - Full counterfactuals require U (unobserved)
# - Partial identification is possible via sensitivity analysis
# - Bounds depend on assumptions about U's distribution
A → Y is identifiable: true
Required adjustment: Set([1])
A ⫫ Y (no adjustment): false
A ⫫ Y | U: false
Modelling note

Structural counterfactuals ask how the outcome for a specific unit would change under an alternative intervention, holding fixed the unit’s exogenous realisation \(U\). Limits on counterfactual identification revealed by the graph, especially when key confounders are unobserved, can be read as limits on what can be recovered from data without stronger assumptions.

10.10 World Context

This chapter addresses Imagining in the Structural layer: what unit-level alternative outcomes are implied by a structural model? Counterfactuals at the Structural level compare alternative interventions for the same unit (fixed \(\mathbf{u}\)). This is distinct from:

  • Counterfactuals at Dynamical level (Chapter 16): Alternative dynamic trajectories
  • Counterfactuals at Observable level (Chapter 25): Alternative observable outcomes

10.11 Key Takeaways

  1. Counterfactuals require unit-level reasoning with shared exogenous noise
  2. They are stronger than interventional averages but require stronger assumptions
  3. In structural systems, explicit noise models and identifiable structure are essential
  4. Bounds and sensitivity analysis are valuable when full identification is impossible
  5. Counterfactuals at Structural level explore alternative structural configurations

10.12 Further Reading