4 The Primary Unit: The Dyad

Status: Draft

v0.3

4.1 Learning Objectives

After reading this chapter, you will be able to:

Understand the dyad as the fundamental unit of causal structure
Recognise how the dyad (two nodes, one edge) is the minimal unit of directed dependence
See how all complex causal structures are built from combinations of dyads
Connect dyad-first thinking to structural causal models and apply it when building them

4.2 Introduction

In causal graphs, the dyad (two nodes connected by a single directed edge) is the minimal unit of structure. This chapter treats the dyad as the primary unit of causal modelling, showing how all complex causal graphs are built from combinations of dyads.

In graph theory and causal inference (Peters et al. 2017), the dyad is the minimal unit of directed dependence and a building block for larger structures (chains, forks, colliders, and networks).

4.3 The Dyad: Two Nodes, One Edge

A dyad consists of two nodes connected by a single directed edge: \(X \rightarrow Y\). This is the smallest structure that encodes a direct dependence of \(Y\) on \(X\).

The dyad = the minimal unit of directed dependence

The dyad \(X \rightarrow Y\) encodes:

Which occasions are in relation: \(X\) and \(Y\)
How they relate: the edge direction \(X \to Y\)
The mechanism: encoded in the structural equation

In structural causal models, the dyad is expressed through a structural equation:

\[ Y \coloneqq f(X, U) \]

where:

\(X\) is the source variable
\(Y\) is the target variable
\(f\) encodes the mechanism mapping inputs to output
\(U\) represents exogenous noise (unmodelled variation)

Intervention semantics: When we intervene with \(do(X = x)\), we modify the generating process by fixing \(X\) to the value \(x\). The distribution of \(Y\) then changes through the mechanism \(f\).

4.4 Why the Dyad is Fundamental

Every edge in a graph \(G = (V, E)\) connects exactly two nodes, so the edge set \(E\) is a collection of dyads. This makes the dyad the basic unit from which more complex causal structure is built.

4.5 Dyad Semantics

4.5.1 Structural Equation

A dyad \(X \rightarrow Y\) is encoded in a structural equation:

\[ Y \coloneqq f(X, U) \]

where:

\(X\) is the source variable
\(Y\) is the target variable
\(f\) encodes how \(X\) influences \(Y\) (the mechanism)
\(U\) is exogenous noise
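As a minimal sketch of the structural equation above, the following Julia snippet samples from a single dyad using only the standard library. The linear mechanism `f(x, u) = 2x + u`, the Gaussian noise, and the helper name `sample_dyad` are assumptions chosen for illustration; the text does not prescribe any particular form for \(f\) or \(U\).

```julia
using Random, Statistics

# Hypothetical linear mechanism f(x, u) = 2x + u (an illustrative assumption)
f(x, u) = 2.0 * x + u

# Draw n samples from the dyad X → Y.
# X is generated from its own noise; Y is assigned by Y := f(X, U).
function sample_dyad(rng::AbstractRNG, n::Integer)
    x = randn(rng, n)          # source variable X
    u = 0.5 .* randn(rng, n)   # exogenous noise U, drawn independently of X
    y = f.(x, u)               # target variable Y := f(X, U)
    return x, y
end

rng = MersenneTwister(42)
x, y = sample_dyad(rng, 10_000)

# Under this assumed mechanism, the regression slope of Y on X should be close to 2
slope = cov(x, y) / var(x)
println("Estimated slope of Y on X: ", round(slope, digits = 2))
```

The assignment is one-directional: `y` is computed from `x` and `u`, never the reverse, which is what the edge direction \(X \to Y\) expresses.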

4.5.2 Intervention on a Dyad

When we intervene with \(do(X = x)\), we fix the input:

\[ Y \coloneqq f(x, U) \]

The dyad structure remains (the edge still exists), but the source is now fixed exogenously rather than generated by its usual mechanism.
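One way to see the effect of an intervention is to simulate the same dyad under both regimes. The sketch below assumes a hypothetical linear mechanism and standard normal noise; the function names `sample_observational` and `sample_do` are illustrative and do not come from any library.

```julia
using Random, Statistics

# Hypothetical mechanism for the dyad X → Y (illustrative assumption)
mechanism(x, u) = 2.0 * x + u

# Observational regime: X is generated by its own mechanism (standard normal noise here)
function sample_observational(rng::AbstractRNG, n::Integer)
    x = randn(rng, n)
    u = randn(rng, n)
    return x, mechanism.(x, u)
end

# Interventional regime do(X = x0): X's own mechanism is replaced by the constant x0.
# The mechanism for Y and the noise U are left untouched.
function sample_do(rng::AbstractRNG, n::Integer, x0::Real)
    x = fill(float(x0), n)
    u = randn(rng, n)
    return x, mechanism.(x, u)
end

rng = MersenneTwister(1)
_, y_obs = sample_observational(rng, 20_000)
_, y_do = sample_do(rng, 20_000, 3.0)

println("mean(Y), observational:  ", round(mean(y_obs), digits = 2))
println("mean(Y), under do(X = 3): ", round(mean(y_do), digits = 2))
```

Under these assumptions the observational mean of \(Y\) sits near 0 and the interventional mean near \(2 \cdot 3 = 6\). Only the way \(X\) is produced differs between the two functions; the call to `mechanism` is identical in both.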

4.5.3 The Dyad Encodes a Local Mechanism

The dyad \(X \rightarrow Y\) is not merely a connection; it encodes a complete local dependency:

Which variables are in relation: \(X\) and \(Y\) (the two nodes)
How they relate: the edge direction \(X \to Y\)
The mechanism: encoded in the function \(f\)
What novelty is introduced: encoded in the exogenous noise \(U\)

Key insight: The dyad is the fundamental unit. Two nodes and one directed edge form the minimal unit of causal structure.

4.5.4 Implementation: Visualising the Dyad

We can construct the fundamental dyad structure with Graphs.jl and plot it with DAGMakie (the plotting package loaded in the code below):

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie Graphs

# The dyad: X → Y (two nodes, one edge)
g_dyad = SimpleDiGraph(2)
add_edge!(g_dyad, 1, 2)  # X → Y

true

The dyad: the fundamental unit of causal structure

4.5.5 Nodes Defined by Their Incident Edges

In a process-first ontology, nodes are defined by their incident edges:

A node with no incoming edges is a source (e.g., an exogenous variable)
A node with no outgoing edges is a sink (a final occasion, or an observed variable)
A node with both incoming and outgoing edges is an intermediate occasion (prehending some and prehended by others)

The node’s identity and role emerge from the pattern of edges incident to it.

4.5.6 Implementation: Nodes Defined by Their Incident Edges

We can demonstrate how nodes are defined by their incident edges:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie CairoMakie Graphs

# Example: Graph with source, intermediate, and sink nodes
g = SimpleDiGraph(4)
add_edge!(g, 1, 2)  # Source → Intermediate
add_edge!(g, 2, 3)  # Intermediate → Intermediate
add_edge!(g, 3, 4)  # Intermediate → Sink

# Identify node roles
"""
    node_role(g, node)

Determine a node's role from its incident edges.

Returns "Source", "Sink", "Intermediate", or "Isolated".
"""
function node_role(g::AbstractGraph, node::Integer)
    # indegree/outdegree (Graphs.jl) count the incoming and outgoing edges of a node
    has_incoming = indegree(g, node) > 0
    has_outgoing = outdegree(g, node) > 0

    if !has_incoming && has_outgoing
        return "Source"
    elseif has_incoming && !has_outgoing
        return "Sink"
    elseif has_incoming && has_outgoing
        return "Intermediate"
    else
        return "Isolated"
    end
end

# Visualise with role labels
let
    node_colors = [:yellow, :lightblue, :lightblue, :lightgreen]  # Source, Intermediate, Intermediate, Sink
    node_labels = [string(i, ": ", node_role(g, i)) for i in 1:nv(g)]

    fig, ax, p = dagplot(g,
        figure_size = (800, 400),
        layout_mode = :acyclic,
        node_color = node_colors,
        nlabels = node_labels,
        nlabels_fontsize = 12
    )

    fig  # Only this gets displayed
end

println("Node roles (defined by incident edges):")
for i in 1:nv(g)
    println("  Node ", i, ": ", node_role(g, i))
end

Node roles (defined by incident edges):
  Node 1: Source
  Node 2: Intermediate
  Node 3: Intermediate
  Node 4: Sink

4.6 Building from Dyads

4.6.1 Multiple Dyads = Larger Graph Structure

When we have multiple dyads, we have a larger dependency structure:

Two dyads: \(X \rightarrow Y\) and \(Y \rightarrow Z\) form a chain (a two-step dependency, three nodes)
Two dyads sharing a source: \(X \leftarrow Y \rightarrow Z\) form a fork (one occasion prehended by two others, three nodes)
Two dyads sharing a target: \(X \rightarrow Y \leftarrow Z\) form a collider (two occasions prehending the same target, three nodes)

Each pattern is a combination of dyads, not a fundamental structure in itself. The fundamental unit remains the dyad.
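The three patterns above can be told apart purely by which endpoint two dyads share. The following sketch makes that explicit; representing a dyad as a `Pair` of symbols and the function name `classify` are choices made for this example, not part of the chapter's code.

```julia
# A dyad is represented here as source => target (a representation assumed for this sketch)
const Dyad = Pair{Symbol,Symbol}

# Classify the three-node pattern formed by two dyads.
function classify(a::Dyad, b::Dyad)
    nodes = Set((a.first, a.second, b.first, b.second))
    # A chain, fork, or collider involves exactly three distinct nodes
    length(nodes) == 3 || return :other

    if a.second == b.first || b.second == a.first
        return :chain      # the target of one dyad is the source of the other
    elseif a.first == b.first
        return :fork       # shared source
    elseif a.second == b.second
        return :collider   # shared target
    else
        return :other      # e.g. a self-loop next to an unrelated dyad
    end
end

println(classify(:X => :Y, :Y => :Z))  # X → Y → Z
println(classify(:Y => :X, :Y => :Z))  # X ← Y → Z
println(classify(:X => :Y, :Z => :Y))  # X → Y ← Z
```

The classifier inspects nothing beyond the two edges, which is the sense in which each pattern is a combination of dyads rather than a primitive of its own.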

4.6.2 Complex Structures = Combinations of Dyads

A graph is a pair \(G = (V, E)\), where:

\(V\) is the set of vertices (nodes / variables)
\(E\) is the set of edges (directed dependencies)

In a dyad-first view: A graph is a collection of dyads (two nodes connected by one directed edge). Each edge in \(E\) represents a dyad, and complex structures emerge from how dyads connect and overlap.
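A small sketch of the dyad-first view: start from nothing but a list of dyads and derive the rest of the graph from it. The node names and the `parents` dictionary are hypothetical and serve only as illustration.

```julia
# Edge list: each entry is one dyad, written source => target
dyads = [:A => :B, :B => :C, :A => :C, :D => :C]

# The vertex set is recovered as the union of all dyad endpoints
vertices = sort(unique(vcat(first.(dyads), last.(dyads))))

# Parents of each node: the sources of the dyads that point at it
parents = Dict(v => Symbol[] for v in vertices)
for (src, dst) in dyads
    push!(parents[dst], src)
end

for v in vertices
    println(v, " <- ", parents[v])
end
```

One limitation of this representation is that a node with no incident edges never appears in the dyad list, so isolated variables would have to be recorded separately in \(V\).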

4.6.3 Sparsity of Edges

Most possible dyads don’t exist. In a system with \(n\) nodes, there are \(n(n-1)\) possible directed dyads, but typically only a small fraction of these exist. This sparsity is a common modelling assumption (and empirical regularity) in networked systems, and it is what makes many algorithms and computations scalable.

4.6.4 Implementation: Sparsity of Dyads

We can demonstrate sparsity:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using DAGMakie Graphs

# Example: System with n nodes
n = 10

# Total possible directed dyads
total_possible = n * (n - 1)
println("System with ", n, " nodes:")
println("  Total possible directed dyads: ", total_possible)

# Typical sparse structure: each node prehends only a few others
# Example: each node has on average 2 incoming edges
avg_edges_per_node = 2
typical_edges = n * avg_edges_per_node
sparsity_ratio = 1 - typical_edges / total_possible

println("  Typical number of edges: ", typical_edges)
println("  Sparsity ratio: ", round(sparsity_ratio * 100, digits=1), "%")
println("  Most possible dyads (", round(sparsity_ratio * 100, digits=1), "%) don't exist")

# Create a sparse graph example
g_sparse = SimpleDiGraph(n)
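# Add a few random edges (unseeded, so the edge count varies between runs)
for i in 1:n
    # Draw 1-2 candidate targets; self-loops and duplicates are skipped below,
    # so a node may end up with fewer outgoing edges than candidates
    targets = rand(1:n, rand(1:2))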
    for t in targets
        if t != i && !has_edge(g_sparse, i, t)
            add_edge!(g_sparse, i, t)
        end
    end
end

actual_edges = ne(g_sparse)
actual_sparsity = 1 - actual_edges / total_possible

println("\nExample sparse graph:")
println("  Actual edges: ", actual_edges, " out of ", total_possible, " possible")
println("  Sparsity: ", round(actual_sparsity * 100, digits=1), "%")
println("\nThis sparsity reflects that most variables do not directly depend on most others")

System with 10 nodes:
  Total possible directed dyads: 90
  Typical number of edges: 20
  Sparsity ratio: 77.8%
  Most possible dyads (77.8%) don't exist

Example sparse graph:
  Actual edges: 9 out of 90 possible
  Sparsity: 90.0%

This sparsity reflects that most variables do not directly depend on most others

4.7 Key Takeaways

The dyad is the primary unit: Two nodes connected by one edge form a minimal unit of causal structure
The dyad encodes a local mechanism: The dyad \(X \rightarrow Y\) specifies a direct dependency plus a mechanism and noise term
All complex structures are combinations of dyads: Chains, forks, colliders, and networks are all built from multiple dyads
Dyad-first thinking: Many structural questions can be phrased in terms of edges and local mechanisms
Intervention modifies the generating process: \(do(X = x)\) changes the distribution of \(Y\) by fixing the input
Sparsity of dyads: Most possible dyads don’t exist, and sparsity drives both modelling choices and computational scalability

4.8 Further Reading

Graph Theory and Causal Patterns: How multiple dyads combine to form chains, forks, colliders, and complex structures
The Causal Hierarchy and Three Worlds: The framework that situates the dyad
Introduction: Overview of the modelling layers and causal questions