```julia
# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie Graphs

# Pattern 1: Chain A → B → C
g_chain = SimpleDiGraph(3)
add_edge!(g_chain, 1, 2) # A → B
add_edge!(g_chain, 2, 3) # B → C

# Pattern 2: Fork A ← B → C
g_fork = SimpleDiGraph(3)
add_edge!(g_fork, 2, 1) # B → A
add_edge!(g_fork, 2, 3) # B → C

# Pattern 3: Collider A → B ← C
g_collider = SimpleDiGraph(3)
add_edge!(g_collider, 1, 2) # A → B
add_edge!(g_collider, 3, 2) # C → B

# Visualise all three patterns
let
    fig = Figure(size = (1200, 400))
    ax1 = Axis(fig[1, 1], title = "Chain")
    ax2 = Axis(fig[1, 2], title = "Fork")
    ax3 = Axis(fig[1, 3], title = "Collider")
    # Chain: A → B → C
    dagplot!(ax1, g_chain,
        layout_mode = :acyclic,
        node_color = :lightblue,
        nlabels = ["A", "B", "C"]
    )
    # Fork: A ← B → C
    dagplot!(ax2, g_fork,
        layout_mode = :acyclic,
        node_color = :lightgreen,
        nlabels = ["A", "B", "C"]
    )
    # Collider: A → B ← C
    dagplot!(ax3, g_collider,
        layout_mode = :acyclic,
        node_color = :lightcoral,
        nlabels = ["A", "B", "C"]
    )
    fig # Only this gets displayed
end
```

5 Graph Theory and Causal Patterns
v0.4
5.1 Learning Objectives
After reading this chapter, you will be able to:
- Understand graph theory as the mathematical representation of directed dependencies
- Identify the three fundamental causal patterns (chains, forks, colliders)
- Apply d-separation to connect graph structure to conditional independence
- Understand sparse matrices as the computational bridge
- Use the Markov boundary to determine minimal sufficient sets
5.2 Introduction
This chapter establishes graph theory as the foundation of the Structural layer: how we represent causal structure as a directed graph. We build from the dyad (Chapter 2) to complex graph structures, showing how much causal reasoning can be reduced to properties of paths and conditioning sets.
Key principle: All complex structures are combinations of dyads. The graph structure \(G = (V, E)\) is composed entirely of dyads (two nodes connected by one directed edge). Complex causal structures emerge from how dyads connect and overlap.
5.3 Graph Theory and Directed Dependencies
5.3.1 Building from Dyads
As established in The Primary Unit: The Dyad, a dyad consists of two nodes connected by a single directed edge: \(X \rightarrow Y\). This is the minimal unit of directed dependence in a causal graph.
Key principles:
- One dyad = one directed dependence
- The dyad encodes a local mechanism, typically written as \(Y \coloneqq f(X, U)\)
- Complex structures are combinations of dyads
- Graph structure = collection of dyads (two nodes, one edge each)
5.3.2 Graphs as Structural Representation
In a structural causal model, we have \(\mathcal{M} = (G, U, F, P(U))\) where \(G\) is a directed acyclic graph (DAG) representing causal structure (Pearl 2009; Spirtes et al. 2000). Formally, a graph is defined as \(G = (V, E)\) where:
- \(V\) is the set of vertices (nodes / variables)
- \(E\) is the set of edges (directed pairs) encoding direct dependencies assumed by the model
Each edge in \(E\) represents a dyad. The graph structure emerges from how these dyads connect.
5.3.3 Why Directed Acyclic Graphs?
DAGs are natural in many causal problems because:
- Direction: causal influence is directional. An edge \(A \rightarrow B\) means \(A\) is a parent of \(B\).
- Acyclicity: acyclicity rules out directed feedback within the structural graph (useful for many identification results).
- Parent sets: the parent set \(\text{Pa}(X_i)\) lists the direct inputs to the mechanism for \(X_i\).
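Parent sets can be read directly off a Graphs.jl digraph. A minimal sketch using only Graphs.jl (`inneighbors` returns the parents of a node, and `is_cyclic` checks acyclicity):

```julia
using Graphs

# Build A → B ← C as two dyads (nodes 1 = A, 2 = B, 3 = C)
g = SimpleDiGraph(3)
add_edge!(g, 1, 2)  # A → B
add_edge!(g, 3, 2)  # C → B

# Pa(B): the direct inputs to B's mechanism
println("Pa(B) = ", inneighbors(g, 2))  # nodes 1 and 3 (A and C)
println("Pa(A) = ", inneighbors(g, 1))  # empty: A is exogenous here

# A DAG must have no directed cycles
println("Acyclic: ", !is_cyclic(g))
```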
5.3.4 The Three Fundamental Causal Patterns
The three fundamental patterns in causal graphs are all combinations of two dyads:
5.3.4.1 Chain: Two Sequential Dyads
Pattern: \(A \rightarrow B \rightarrow C\) (two sequential dyads)
- Dyad 1: \(A \rightarrow B\)
- Dyad 2: \(B \rightarrow C\)
Meaning: A mediating chain. This creates a directed path of influence.
d-separation: \(A\) and \(C\) are dependent marginally, but independent conditional on \(B\) (the mediator blocks the path).
5.3.4.2 Fork: Two Dyads from Common Source
Pattern: \(A \leftarrow B \rightarrow C\) (two dyads from a common source)
- Dyad 1: \(B \rightarrow A\)
- Dyad 2: \(B \rightarrow C\)
Meaning: Common cause.
d-separation: \(A\) and \(C\) are dependent marginally (they share the common cause \(B\)), but independent conditional on \(B\) (conditioning on the common cause blocks the path).
5.3.4.3 Collider: Two Dyads Converging
Pattern: \(A \rightarrow B \leftarrow C\) (two dyads converging)
- Dyad 1: \(A \rightarrow B\)
- Dyad 2: \(C \rightarrow B\)
Meaning: Convergence (a collider).
d-separation: \(A\) and \(C\) are independent marginally (no open path connects them), but dependent conditional on \(B\) (conditioning on the collider opens the path, the "explaining away" pattern).
5.3.5 Implementation: Visualising the Three Fundamental Patterns
We can visualise and verify the three fundamental patterns using Graphs.jl and DAGMakie; the listing above constructs g_chain, g_fork, and g_collider and plots the three patterns side by side.
5.3.6 d-Separation: Connecting Graph Structure to Conditional Independence
d-separation is a graph-theoretic criterion that connects graph structure to conditional independence (Pearl 2009). Two nodes \(X\) and \(Y\) are d-separated by a set \(Z\) if all paths between \(X\) and \(Y\) are blocked by \(Z\).
Path blocking rules (based on the three patterns):
- Chain: \(A \rightarrow B \rightarrow C\): Path is blocked if \(B \in Z\)
- Fork: \(A \leftarrow B \rightarrow C\): Path is blocked if \(B \in Z\)
- Collider: \(A \rightarrow B \leftarrow C\): Path is blocked if \(B \notin Z\) and no descendant of \(B\) is in \(Z\)
Why d-separation matters: If \(X\) and \(Y\) are d-separated by \(Z\) in the graph, then \(X \perp\!\!\!\perp Y \mid Z\) holds in any probability distribution compatible with the graph structure.
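These blocking rules leave testable statistical signatures. As a quick numeric illustration of the collider's "explaining away", here is a minimal sketch using only Julia's standard library; the data-generating equation \(B := A + C + U\) is an assumption chosen for illustration:

```julia
using Random, Statistics

Random.seed!(1)
n = 100_000
A = randn(n)
C = randn(n)                    # generated independently of A
B = A .+ C .+ 0.5 .* randn(n)   # collider: A → B ← C

# Marginally, A and C are (near) uncorrelated
println("cor(A, C) ≈ ", round(cor(A, C), digits = 3))

# Partial correlation of A and C given B: correlate the residuals
# after regressing each variable on B (simple least squares)
resid(x, b) = x .- (cov(x, b) / var(b)) .* b
rA, rC = resid(A, B), resid(C, B)
# Conditioning on the collider induces a strong (negative) association
println("partial cor(A, C | B) ≈ ", round(cor(rA, rC), digits = 3))
```

The marginal correlation is close to zero while the partial correlation given \(B\) is strongly negative: knowing \(B\) makes \(A\) and \(C\) informative about each other.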
5.3.7 Implementation: d-Separation Examples
We can verify d-separation properties using CausalDynamics.jl:
```julia
# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using CausalDynamics Graphs

# Example 1: Chain A → B → C
# A and C are d-separated by B (the mediator blocks the path)
g_chain = SimpleDiGraph(3)
add_edge!(g_chain, 1, 2) # A → B
add_edge!(g_chain, 2, 3) # B → C
println("Chain pattern: A → B → C")
println("A ⫫ C | B: ", CausalDynamics.d_separated(g_chain, 1, 3, [2]))   # true: blocked by B
println("A ⫫ C: ", CausalDynamics.d_separated(g_chain, 1, 3, Int[]))     # false: path is open without conditioning

# Example 2: Fork A ← B → C
# A and C are d-separated by B (the common cause blocks the path)
g_fork = SimpleDiGraph(3)
add_edge!(g_fork, 2, 1) # B → A
add_edge!(g_fork, 2, 3) # B → C
println("\nFork pattern: A ← B → C")
println("A ⫫ C | B: ", CausalDynamics.d_separated(g_fork, 1, 3, [2]))    # true: blocked by B
println("A ⫫ C: ", CausalDynamics.d_separated(g_fork, 1, 3, Int[]))      # false: path is open without conditioning

# Example 3: Collider A → B ← C
# A and C are independent marginally, but dependent when conditioning on B
g_collider = SimpleDiGraph(3)
add_edge!(g_collider, 1, 2) # A → B
add_edge!(g_collider, 3, 2) # C → B
println("\nCollider pattern: A → B ← C")
println("A ⫫ C: ", CausalDynamics.d_separated(g_collider, 1, 3, Int[]))  # true: the collider blocks the path
println("A ⫫ C | B: ", CausalDynamics.d_separated(g_collider, 1, 3, [2])) # false: conditioning on the collider opens it

# Example 4: More complex graph
# Z → X → Y ← W, with Z → W
# Check d-separation of different pairs given different sets
g_complex = SimpleDiGraph(4)
add_edge!(g_complex, 1, 2) # Z → X
add_edge!(g_complex, 2, 3) # X → Y
add_edge!(g_complex, 4, 3) # W → Y
add_edge!(g_complex, 1, 4) # Z → W
println("\nComplex graph: Z → X → Y ← W, Z → W")
println("X ⫫ Y: ", CausalDynamics.d_separated(g_complex, 2, 3, Int[]))   # false: direct edge X → Y
println("X ⫫ Y | W: ", CausalDynamics.d_separated(g_complex, 2, 3, [4])) # false: the direct edge remains
println("Z ⫫ Y | X: ", CausalDynamics.d_separated(g_complex, 1, 3, [2])) # false: the path Z → W → Y stays open
```

```
Chain pattern: A → B → C
A ⫫ C | B: true
A ⫫ C: false

Fork pattern: A ← B → C
A ⫫ C | B: true
A ⫫ C: false

Collider pattern: A → B ← C
A ⫫ C: true
A ⫫ C | B: false

Complex graph: Z → X → Y ← W, Z → W
X ⫫ Y: false
X ⫫ Y | W: false
Z ⫫ Y | X: false
```
5.3.8 Sparse Matrices: The Computational Bridge
The connection between graph theory, linear algebra, and structural equations becomes concrete through sparse matrices. Every directed graph \(G\) can be represented as an adjacency matrix \(A\) where \(A_{ij} = 1\) if there's an edge from \(i\) to \(j\), and \(A_{ij} = 0\) otherwise.
In most real systems, variables depend directly on only a small subset of other variables. This sparsity means the adjacency matrix is sparse, with most entries equal to zero.
When structural equations are linear, they can be written in matrix form:
\[ \mathbf{X}_{t+1} = F \mathbf{X}_t + G \mathbf{A}_t + \mathbf{U}_{t+1} \]
where \(F\) is the transition matrix encoding which state components influence which others. Sparsity of dependencies means \(F\) is sparse.
The Three-Way Connection: Graph theory provides the structure (\(G\) encodes which variables influence which others), linear algebra provides the computation (matrix operations propagate influence), and sparse matrices provide the bridge (efficient representation and computation).
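This three-way connection can be sketched with SparseArrays: build a sparse transition matrix \(F\) whose non-zero pattern matches the graph, then propagate the state one step. The sketch drops the action term \(G \mathbf{A}_t\) and the noise for clarity, and the coefficient values are illustrative assumptions:

```julia
using SparseArrays

# Sparse transition matrix for the chain X₁ → X₂ → X₃ plus self-persistence:
# F[i, j] is non-zero only where X_j influences X_i (illustrative coefficients)
F = sparse([1, 2, 2, 3, 3], [1, 1, 2, 2, 3],
           [0.9, 0.5, 0.9, 0.5, 0.9], 3, 3)

x_t  = [1.0, 0.0, 0.0]  # initial state: only X₁ is active
x_t1 = F * x_t          # one deterministic step of X_{t+1} = F X_t
println("x_{t+1} = ", x_t1)  # → [0.9, 0.5, 0.0]: influence propagates X₁ → X₂

# The sparsity of F mirrors the sparsity of the graph's edge set
println("nnz(F) = ", nnz(F), " of ", length(F), " entries")
```

After one step only \(X_2\) has picked up influence from \(X_1\); a second multiplication would carry it on to \(X_3\), mirroring path-following in the graph.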
5.3.9 Implementation: Sparse Matrix Representation
We can demonstrate the connection between graphs and sparse matrices:
```julia
# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using Graphs SparseArrays

# Create a graph: X₁ → X₂ → X₃, with X₁ → X₃
g = SimpleDiGraph(3)
add_edge!(g, 1, 2) # X₁ → X₂
add_edge!(g, 2, 3) # X₂ → X₃
add_edge!(g, 1, 3) # X₁ → X₃

# adjacency_matrix already returns a sparse matrix, so convert to dense for comparison
A_dense = Matrix(adjacency_matrix(g))
println("Dense adjacency matrix:")
println(A_dense)

# Sparse representation
A_sparse = adjacency_matrix(g)
println("\nSparse adjacency matrix:")
println(A_sparse)
println("\nSparsity: ", nnz(A_sparse), " non-zero entries out of ", length(A_sparse), " total")
println("Sparsity ratio: ", 1 - nnz(A_sparse) / length(A_sparse))

# For a larger graph, the sparse representation is much more memory-efficient
# Example: 100-node graph with up to 200 edges (duplicates and self-loops are skipped)
g_large = SimpleDiGraph(100)
for _ in 1:200
    src = rand(1:100)
    dst = rand(1:100)
    if src != dst && !has_edge(g_large, src, dst)
        add_edge!(g_large, src, dst)
    end
end
A_large_dense = Matrix(adjacency_matrix(g_large))
A_large_sparse = adjacency_matrix(g_large)
# sizeof only measures the struct header, so use Base.summarysize for the full payload
println("\nLarge graph (100 nodes, ~200 edges):")
println("Dense matrix size: ", Base.summarysize(A_large_dense), " bytes")
println("Sparse matrix size: ", Base.summarysize(A_large_sparse), " bytes")
println("Memory savings: ", round((1 - Base.summarysize(A_large_sparse) / Base.summarysize(A_large_dense)) * 100, digits=1), "%")
```

```
Dense adjacency matrix:
[0 1 1; 0 0 1; 0 0 0]

Sparse adjacency matrix:
sparse([1, 1, 2], [2, 3, 3], [1, 1, 1], 3, 3)

Sparsity: 3 non-zero entries out of 9 total
Sparsity ratio: 0.6666666666666667
```
5.3.10 Markov Boundary: The Minimal Sufficient Set
The Markov boundary of a node \(X\) is the minimal set of nodes that makes \(X\) conditionally independent of all other nodes (Pearl 2009). It consists of:
- Parents of \(X\): Direct causes
- Children of \(X\): Direct effects
- Other parents of children: Spouses (other direct causes of \(X\)'s children)
The Markov boundary provides a principled criterion for determining the minimal set of variables needed for causal reasoning about \(X\).
5.3.11 Implementation: Computing Markov Boundary
We can compute the Markov boundary using graph structure. The Markov boundary of a node consists of its parents, children, and other parents of children (spouses):
```julia
# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie Graphs

# Example graph: Z → X → Y ← W, with Z → W
# For node X, the Markov boundary should include:
# - Parents: Z
# - Children: Y
# - Spouses: W (the other parent of Y)
g = SimpleDiGraph(4)
add_edge!(g, 1, 2) # Z → X
add_edge!(g, 2, 3) # X → Y
add_edge!(g, 4, 3) # W → Y
add_edge!(g, 1, 4) # Z → W

"""
    markov_boundary(g, node)

Compute the Markov boundary of `node` in a directed graph `g`:
its parents (direct causes), children (direct effects), and spouses
(other parents of the node's children).
"""
function markov_boundary(g::AbstractGraph, node::Integer)
    parents = collect(inneighbors(g, node))   # nodes with edges into node
    children = collect(outneighbors(g, node)) # nodes with edges from node
    # Spouses: other parents of the node's children
    spouses = Set{Int}()
    for child in children
        for parent in inneighbors(g, child)
            parent != node && push!(spouses, parent)
        end
    end
    return (parents=parents, children=children, spouses=collect(spouses),
            boundary=sort(unique([parents; children; collect(spouses)])))
end

# Compute the Markov boundary for X (node 2)
mb = markov_boundary(g, 2)
println("Markov boundary for X (node 2):")
println("  Parents: ", mb.parents)        # [1] = Z
println("  Children: ", mb.children)      # [3] = Y
println("  Spouses: ", mb.spouses)        # [4] = W
println("  Full boundary: ", mb.boundary) # [1, 3, 4] = Z, Y, W

# Visualise the graph with X highlighted
let
    node_colors = [:lightblue, :yellow, :lightgreen, :lightgreen] # X (node 2) in yellow
    fig, ax, p = dagplot(g;
        figure_size = (800, 600),
        layout_mode = :acyclic,
        node_color = node_colors,
        nlabels = ["Z", "X", "Y", "W"]
    )
    fig # Only this gets displayed
end
```

```
Markov boundary for X (node 2):
  Parents: [1]
  Children: [3]
  Spouses: [4]
  Full boundary: [1, 3, 4]
```
5.4 World Context
This chapter addresses Seeing in the Structural layer: what can we infer about causal structure from conditional independences and graph properties? Graph theory provides the mathematical foundation for representing and reasoning about structure. Many concepts here are abstract and not tied to a particular time index:
- Graph structure: assumed causal structure
- d-separation: implied conditional independences
- Sparse matrices: computational structure induced by sparse dependencies
- Markov boundary: minimal sufficient sets for prediction/adjustment
All complex structures are built from combinations of dyads, and graph theory provides the language for understanding these structures.
5.5 Key Takeaways
- Graph theory as structural representation: Graphs represent collections of dyads (directed dependencies)
- Three fundamental patterns: Chains, forks, and colliders are combinations of two dyads
- d-separation: Connects graph structure to conditional independence
- Sparse matrices: Computational bridge between graph theory and linear algebra
- Markov boundary: Principled criterion for minimal sufficient sets
- All structures are combinations of dyads: The fundamental unit remains the dyad
5.6 Further Reading
- Pearl (2009): Causality – graph theory and d-separation
- Spirtes et al. (2000): Causation, Prediction, and Search – graph-theoretic causal discovery
- Structural Causal Models as Executable Mechanisms: How graphs are encoded in SCMs