6  Structural Causal Models as Executable Mechanisms

Status: Draft

v0.4

6.1 Learning Objectives

After reading this chapter, you will be able to:

  • Understand SCMs as collections of structural assignments with exogenous noise
  • Recognise the modularity principle: mechanisms can be swapped under intervention
  • See how stochastic models fit cleanly into SCM semantics via explicit noise variables
  • Write down SCMs for simple dynamical systems
  • Understand how SCMs encode directed dependencies and mechanisms

6.2 Introduction

This chapter introduces Structural Causal Models (SCMs) as executable mechanisms that encode directed dependencies between variables. SCMs provide the mathematical framework for representing causal mechanisms (Pearl 2009; Peters et al. 2017), building on the graph structure established in Graph Theory and Causal Patterns.

6.3 What Is an SCM?

An SCM is a tuple \(\mathcal{M} = (G, U, F, P(U))\) where (Pearl 2009; Peters et al. 2017):

  • \(G\): Directed acyclic graph (DAG) representing causal structure (a set of directed edges / dyads)
  • \(U\): Exogenous (unobserved) variables capturing unmodelled variation and noise
  • \(F\): Structural assignments (functions) encoding the causal mechanisms
  • \(P(U)\): Distribution over exogenous variables

The graph \(G\) encodes which directed dependencies exist, while the assignments \(F\) encode how parent variables and exogenous noise determine each variable.

6.4 Structural Assignments

Each endogenous variable \(X_i\) is assigned via a structural equation:

\[ X_i \coloneqq f_i(\text{Pa}(X_i), U_i) \]

where \(\text{Pa}(X_i)\) are the parents of \(X_i\) in \(G\).

The parent set \(\text{Pa}(X_i)\) specifies which variables directly enter the mechanism for \(X_i\). The graph \(G\) encodes this dependency structure, and \(f_i\) specifies the mechanism that maps parents and exogenous noise into \(X_i\).
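To make the assignment rule concrete, here is a minimal sketch in plain Julia. It assumes a hypothetical representation (the names `mechanisms` and `evaluate` are illustrative, not part of CausalDynamics.jl) and borrows the three-variable example used later in this chapter. Once the noise values are fixed, every endogenous variable follows deterministically by evaluating parents before children.

```julia
# Hypothetical minimal representation: each mechanism lists its parents and a function f_i.
# The entries are already in a topological order of the graph X → Y ← Z.
mechanisms = [
    (name = :X, parents = Symbol[], f = (pa, u) -> u),
    (name = :Z, parents = Symbol[], f = (pa, u) -> u),
    (name = :Y, parents = [:X, :Z], f = (pa, u) -> 2 * pa[:X] + 3 * pa[:Z] + u),
]

# Evaluate X_i := f_i(Pa(X_i), U_i) for one fixed draw of the exogenous noise.
function evaluate(mechanisms, noise::Dict{Symbol,Float64})
    vals = Dict{Symbol,Float64}()
    for m in mechanisms
        # Collect the already-computed parent values for this mechanism
        pa = Dict{Symbol,Float64}(p => vals[p] for p in m.parents)
        vals[m.name] = m.f(pa, noise[m.name])
    end
    return vals
end

result = evaluate(mechanisms, Dict(:X => 1.0, :Z => 2.0, :Y => 0.5))
println("Y = ", result[:Y])   # 2*1.0 + 3*2.0 + 0.5
```

The loop relies on the list being topologically sorted; for an arbitrary DAG the order would have to be computed from the graph first.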

6.5 Structural Equations as Mechanisms

Each structural equation encodes a causal mechanism:

  • Parents \(\text{Pa}(X_i)\) (direct causes under the assumed graph)
  • Exogenous noise \(U_i\) (unmodelled variation, stochasticity, or latent inputs)

The function \(f_i\) specifies how parent values and noise combine to generate \(X_i\).

6.5.1 Implementation: Defining an SCM

We can represent an SCM using CausalDynamics.jl and Graphs.jl. Here's a simple example:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie CausalDynamics Graphs

# Example SCM: X → Y ← Z (Y is a function of X and Z)
# Graph structure
g = SimpleDiGraph(3)
add_edge!(g, 1, 2)  # X → Y
add_edge!(g, 3, 2)  # Z → Y

# Structural equations:
# X := U_X (exogenous)
# Z := U_Z (exogenous)
# Y := f(X, Z, U_Y) = 2*X + 3*Z + U_Y

# In practice, we represent this as:
# - Graph G: encodes which variables directly affect which others
# - Structural functions F: encode the mechanisms
# - Exogenous distributions P(U): encode stochastic inputs/noise

# Visualise the graph
let
    fig, ax, p = dagplot(g;
        figure_size = (600, 400),
        layout_mode = :acyclic,
        nlabels = ["X", "Y", "Z"]
    )
    fig  # Only this gets displayed
end

# Structural equations (conceptual representation)
println("Structural equations:")
println("  X := U_X")
println("  Z := U_Z")
println("  Y := 2*X + 3*Z + U_Y")
println("\nGraph structure encodes:")
println("  - X has no parents (exogenous)")
println("  - Z has no parents (exogenous)")
println("  - Y has parents {X, Z}")
Structural equations:
  X := U_X
  Z := U_Z
  Y := 2*X + 3*Z + U_Y

Graph structure encodes:
  - X has no parents (exogenous)
  - Z has no parents (exogenous)
  - Y has parents {X, Z}
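The structural equations printed above can also be executed directly. The sketch below uses only Julia's standard library and assumes, purely for illustration, that \(U_X\), \(U_Z\) and \(U_Y\) are independent standard normal variables; the chapter itself does not fix \(P(U)\). Under that assumption \(E[Y] = 0\) and \(\text{Var}(Y) = 2^2 + 3^2 + 1 = 14\), which the sample statistics should approximate.

```julia
using Random, Statistics

# Assumption for illustration only: independent standard normal noise terms.
rng = MersenneTwister(1)   # arbitrary seed
n = 100_000

U_X, U_Z, U_Y = randn(rng, n), randn(rng, n), randn(rng, n)

# Apply the structural assignments in topological order (parents before children).
X = U_X
Z = U_Z
Y = 2 .* X .+ 3 .* Z .+ U_Y

println("sample mean of Y:     ", round(mean(Y); digits = 2))
println("sample variance of Y: ", round(var(Y); digits = 2))
```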

6.6 Modularity Principle

Key insight: Under intervention \(do(X_i = x)\), we replace the assignment for \(X_i\):

\[ X_i \coloneqq x \quad \text{(instead of } f_i(\text{Pa}(X_i), U_i) \text{)} \]

All other mechanisms remain unchanged. This is the modularity of causal mechanisms (Pearl 2009). The intervention modifies how \(X_i\) is determined but does not affect how other variables are determined: future occasions can still prehend \(X_i\), but \(X_i\) no longer prehends its natural parents.

Intervention semantics: When we intervene \(do(X_i = x)\), we modify the data-generating process by fixing \(X_i\) to a specific value. This replaces the original assignment for \(X_i\) with a constant.

6.6.1 Implementation: Modularity Under Intervention

The modularity principle means that under intervention, we modify only the structural equation for the intervened variable, leaving all other mechanisms unchanged:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using Graphs

# Original SCM: X → Y ← Z
# Structural equations:
#   X := U_X
#   Z := U_Z
#   Y := 2*X + 3*Z + U_Y

# Under intervention do(X = 5):
#   X := 5  (modified: no longer depends on U_X)
#   Z := U_Z  (unchanged)
#   Y := 2*X + 3*Z + U_Y  (unchanged, but now X is fixed to 5)

println("Original SCM:")
println("  X := U_X")
println("  Z := U_Z")
println("  Y := 2*X + 3*Z + U_Y")

println("\nUnder intervention do(X = 5):")
println("  X := 5  (MODIFIED: no longer depends on U_X)")
println("  Z := U_Z  (UNCHANGED)")
println("  Y := 2*5 + 3*Z + U_Y = 10 + 3*Z + U_Y  (UNCHANGED mechanism, but X is fixed)")

println("\nKey insight: Only the equation for X changed.")
println("The equation for Y remains the same: it still prehends X and Z,")
println("but now X is fixed to 5 rather than being determined by U_X.")
Original SCM:
  X := U_X
  Z := U_Z
  Y := 2*X + 3*Z + U_Y

Under intervention do(X = 5):
  X := 5  (MODIFIED: no longer depends on U_X)
  Z := U_Z  (UNCHANGED)
  Y := 2*5 + 3*Z + U_Y = 10 + 3*Z + U_Y  (UNCHANGED mechanism, but X is fixed)

Key insight: Only the equation for X changed.
The equation for Y remains the same: it still prehends X and Z,
but now X is fixed to 5 rather than being determined by U_X.
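The same mechanism swap can be run numerically. The following sketch is one possible way to do it, not the CausalDynamics.jl interface: `simulate` is a hypothetical helper whose keyword `fX` holds the mechanism for X, and the noise terms are again assumed to be independent standard normals. Passing a constant function for `fX` implements \(do(X = 5)\) while the code for Z and Y stays untouched.

```julia
using Random, Statistics

# Hypothetical helper: run the example SCM; `fX` is the swappable mechanism for X.
function simulate(rng, n; fX = u -> u)
    U_X, U_Z, U_Y = randn(rng, n), randn(rng, n), randn(rng, n)
    X = fX.(U_X)                  # the only line affected by an intervention on X
    Z = U_Z                       # unchanged mechanism
    Y = 2 .* X .+ 3 .* Z .+ U_Y   # unchanged mechanism
    return (X = X, Z = Z, Y = Y)
end

n = 100_000
obs = simulate(MersenneTwister(1), n)                  # observational regime
itv = simulate(MersenneTwister(1), n; fX = _ -> 5.0)   # do(X = 5), same noise draws

println("Z identical in both regimes: ", itv.Z == obs.Z)
println("mean of Y under do(X = 5):   ", round(mean(itv.Y); digits = 2))
```

Reusing the seed means both runs see the same noise draws, so any difference between `obs` and `itv` is due to the replaced assignment alone. With zero-mean noise, the mean of Y under the intervention should be close to \(2 \cdot 5 = 10\).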

6.7 Linear SCMs and Matrix Representations

When structural equations are linear, they can be written in matrix form, connecting graph theory, linear algebra, and structural equations:

\[ \mathbf{X}_{t+1} = F \mathbf{X}_t + G \mathbf{A}_t + \mathbf{U}_{t+1} \]

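where:

  • \(F\) is the transition matrix encoding how occasions prehend their predecessors
  • \(G\) is the control matrix encoding how actions \(\mathbf{A}_t\) affect occasions
  • \(\mathbf{U}_{t+1}\) is the exogenous noise

Note that in this section \(F\) and \(G\) denote coefficient matrices, not the set of structural assignments and the causal graph from Section 6.3.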

Sparse Dependencies: In most real systems, each variable depends on only a small subset of others. This means the transition matrix \(F\) is often sparse (most entries are zero). Sparse matrix representations enable efficient computation for large systems.

Edge Structure and Matrix Sparsity: The causal graph determines the sparsity pattern of \(F\). If there's no edge from \(X_i\) to \(X_j\) in the graph, then \(F_{ji} = 0\). With the usual convention that adjacency entry \((i, j)\) marks the edge \(X_i \to X_j\), the non-zero entries of \(F\) therefore lie within the sparsity pattern of the transposed adjacency matrix.
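As a sketch of how the update equation \(\mathbf{X}_{t+1} = F \mathbf{X}_t + G \mathbf{A}_t + \mathbf{U}_{t+1}\) can be iterated, the example below simulates a hypothetical two-variable system with a single action. All coefficients, as well as the helper names `advance` and `rollout`, are invented for illustration. The noise is set to zero so that the deterministic part of the dynamics is visible.

```julia
using SparseArrays

# Hypothetical coefficients: X_1 feeds itself and X_2; X_2 feeds only itself.
F = sparse([0.9 0.0;
            0.2 0.5])                   # F[j, i] is the weight of X_i(t) in X_j(t+1)
G = sparse(reshape([1.0, 0.0], 2, 1))   # the single action enters X_1 only

# One application of X_{t+1} = F X_t + G A_t + U_{t+1}
advance(x, a, u) = F * x + G * a + u

# Iterate the update over sequences of actions and noise draws
function rollout(x0, actions, noise)
    xs = [x0]
    for (a, u) in zip(actions, noise)
        push!(xs, advance(xs[end], a, u))
    end
    return xs
end

x0 = [1.0, 0.0]
actions = [[1.0], [0.0], [0.0]]    # act once, then leave the system alone
noise = [zeros(2) for _ in 1:3]    # noise switched off for this illustration
xs = rollout(x0, actions, noise)

for (t, x) in enumerate(xs)
    println("t = ", t - 1, ": ", x)
end
println("non-zero entries in F: ", nnz(F), " of ", length(F))
```

The zero in the top-right corner of `F` states that \(X_2\) at time \(t\) has no direct influence on \(X_1\) at time \(t+1\), which is exactly a missing edge in the corresponding graph.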

6.7.1 Implementation: Linear SCMs and Sparse Matrices

For linear SCMs, we can represent the system in matrix form and demonstrate the connection between graph structure and matrix sparsity:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using Graphs SparseArrays

# Example: Linear SCM with graph X₁ → X₂ → X₃, X₁ → X₃
# Structural equations:
#   X₁ := U₁
#   X₂ := 0.5*X₁ + U₂
#   X₃ := 0.3*X₁ + 0.7*X₂ + U₃

# Graph structure
g = SimpleDiGraph(3)
add_edge!(g, 1, 2)  # X₁ → X₂
add_edge!(g, 2, 3)  # X₂ → X₃
add_edge!(g, 1, 3)  # X₁ → X₃

# Adjacency matrix (shows which edges exist)
A = adjacency_matrix(g)
println("Adjacency matrix (graph structure):")
println(A)

# Transition matrix F (shows how occasions prehend their predecessors)
# F[i,j] = coefficient of X_j in equation for X_i
F = [0.0  0.0  0.0;   # X₁ has no parents
     0.5  0.0  0.0;   # X₂ prehends X₁ with coefficient 0.5
     0.3  0.7  0.0]   # X₃ prehends X₁ (0.3) and X₂ (0.7)

println("\nTransition matrix F (structural equations):")
println(F)

# The sparsity pattern of F matches the transpose of the adjacency matrix:
# an edge i → j sets A[i,j] = 1 and allows a non-zero F[j,i]
println("\nSparsity pattern:")
println("  Edge X₁ → X₂ exists: ", has_edge(g, 1, 2), " → F[2,1] = ", F[2,1], " (non-zero)")
println("  Edge X₂ → X₃ exists: ", has_edge(g, 2, 3), " → F[3,2] = ", F[3,2], " (non-zero)")
println("  Edge X₁ → X₃ exists: ", has_edge(g, 1, 3), " → F[3,1] = ", F[3,1], " (non-zero)")
println("  No edge X₂ → X₁: ", !has_edge(g, 2, 1), " → F[1,2] = ", F[1,2], " (zero)")

# Convert to sparse matrix for efficiency
F_sparse = sparse(F)
println("\nSparse representation:")
println("  Non-zero entries: ", nnz(F_sparse), " out of ", length(F_sparse))
println("  Memory savings for large systems: significant")
Adjacency matrix (graph structure):
sparse([1, 1, 2], [2, 3, 3], [1, 1, 1], 3, 3)

Transition matrix F (structural equations):
[0.0 0.0 0.0; 0.5 0.0 0.0; 0.3 0.7 0.0]

Sparsity pattern:
  Edge X₁ → X₂ exists: true → F[2,1] = 0.5 (non-zero)
  Edge X₂ → X₃ exists: true → F[3,2] = 0.7 (non-zero)
  Edge X₁ → X₃ exists: true → F[3,1] = 0.3 (non-zero)
  No edge X₂ → X₁: true → F[1,2] = 0.0 (zero)

Sparse representation:
  Non-zero entries: 3 out of 9
  Memory savings for large systems: significant
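Two properties of this example can be checked with a few lines of standard-library Julia. The sketch treats the example as a static linear SCM \(\mathbf{X} = F\mathbf{X} + \mathbf{U}\), an interpretation assumed here because the example equations carry no time index. First, the non-zero pattern of `F` equals the transpose of the adjacency pattern printed above. Second, because the graph is acyclic, \(I - F\) is invertible and \(\mathbf{X} = (I - F)^{-1}\mathbf{U}\). The matrix `A` below is a dense copy of the adjacency matrix, and the noise vector is an arbitrary illustrative draw.

```julia
using LinearAlgebra

F = [0.0 0.0 0.0;
     0.5 0.0 0.0;
     0.3 0.7 0.0]
A = [0 1 1;
     0 0 1;
     0 0 0]   # A[i, j] = 1 marks the edge X_i → X_j

# F is non-zero exactly where the transposed adjacency matrix is non-zero
same_pattern = (F .!= 0) == (A' .!= 0)
println("pattern of F equals pattern of A': ", same_pattern)

# Solve X = F X + U for one illustrative noise vector
U = [1.0, 1.0, 1.0]
X = (I - F) \ U
println("X = ", X)

# inv(I - F) collects direct and indirect path weights:
# entry [3, 1] is 0.3 (direct) + 0.5 * 0.7 (via X_2)
T = inv(I - F)
println("total effect of X_1 on X_3: ", T[3, 1])
```

The inverse is formed explicitly here only because the system is tiny; for large sparse systems, solving with `\` avoids building a dense matrix.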

6.8 World Context

This chapter addresses Doing in the Structural layer: how do we represent interventions as changes to a data-generating process while keeping other mechanisms unchanged? SCMs provide the executable framework for interventions, and the modularity principle makes that “local change” semantics explicit.

6.9 Key Takeaways

  1. SCMs as executable mechanisms: SCMs encode causal mechanisms as structural assignments
  2. Structural equations as mechanisms: Each equation specifies how parents and noise generate a variable
  3. Modularity principle: Interventions modify specific mechanisms while others remain unchanged
  4. Linear SCMs: Matrix representations connect graph theory, linear algebra, and structural equations
  5. Sparse dependencies: Most possible edges are absent in realistic models

6.10 Further Reading