29  Causal Decision-Making

Status: Draft

v0.4

29.1 Learning Objectives

After reading this chapter, you will be able to:

  • Understand Markov boundary and Markov blanket as principled criteria for determining CDM scope
  • Determine which variables (nodes) should be included or excluded in a CDM using Markov boundary
  • Connect Markov blanket to the Free Energy Principle and CDM model boundaries
  • Make interventional decisions using causal models
  • Treat uncertainty as central: robust decisions over model and parameter uncertainty

29.2 Introduction

Decision-making requires answering “what if” questions: what happens under different policies? (Murphy 2003; Schulam and Saria 2017; Tennenholtz et al. 2020) This chapter formalises causal decision-making and shows how to determine CDM scope using Markov boundary and Markov blanket.

A fundamental question in causal decision-making is: which variables (nodes) should be included or excluded in a CDM? This question is answered by the concepts of Markov boundary (introduced in Graph Theory and Causal Patterns) and Markov blanket (introduced in From Structure to Time: FEP and Attractors), which provide principled criteria for determining the minimal set of variables needed for causal reasoning and decision-making.

29.3 Determining CDM Scope: Markov Boundary and Markov Blanket

29.3.1 The Problem: What to Include in a CDM?

When constructing a CDM for decision-making, we must decide which variables (nodes) to include. Including too many variables makes the model unnecessarily complex and computationally expensive. Including too few variables risks omitting important causal relationships, leading to biased policy evaluations.

The Markov boundary (introduced in Graph Theory and Causal Patterns) and Markov blanket (introduced in From Structure to Time: FEP and Attractors) provide complementary, principled answers to this question.

29.3.2 Markov Boundary: Graph-Theoretic Minimal Set

The Markov boundary of a target variable \(Y\) is the minimal set of variables needed for causal reasoning about \(Y\) at all three levels of Reason (Pearl 1988, 2014):

  • Level 1 (Association): Given \(\text{MB}(Y)\), no other variables provide additional information about \(Y\)
  • Level 2 (Intervention): The Markov boundary identifies which variables must be included to properly adjust for confounding
  • Level 3 (Counterfactual): The Markov boundary determines which variables are needed to compute counterfactual outcomes

Inclusion principle: When building a CDM for decision-making about outcome \(Y\), we should include at minimum the Markov boundary of \(Y\). This ensures we have all necessary information while avoiding unnecessary complexity.
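Under the usual faithfulness assumptions for a DAG, the Markov boundary has an explicit graphical form (the same construction used in the implementation later in this chapter): the parents of \(Y\), the children of \(Y\), and the other parents of those children (spouses):

\[ \text{MB}(Y) = \mathrm{Pa}(Y) \,\cup\, \mathrm{Ch}(Y) \,\cup\, \Big( \bigcup_{C \in \mathrm{Ch}(Y)} \mathrm{Pa}(C) \Big) \setminus \{Y\} \]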

29.3.3 Markov Blanket: FEP System Boundary

The Markov blanket (from FEP) defines the boundary between internal states (the system) and external states (the environment) (Friston 2010, 2013). In a CDM:

  • Internal states (\(\mu\)): Latent process states \(X_t\) (the inner worlds: Structural, Dynamical)
  • Sensory states (\(s\)): Observations \(Y_t\) (the Observable world)
  • Active states (\(a\)): Actions/interventions \(A_t\) (policy decisions)
  • External states (\(\eta\)): Everything outside the CDM (unmodelled parts of the environment)

The conditional independence \(\mu ⫫ \eta \mid (s, a)\) means that internal and external states are independent given the Markov blanket.

Boundary principle: When constructing a CDM, the Markov blanket defines the minimal boundary—we must include sensory states (observations) and active states (actions), and can exclude external states that are conditionally independent given the blanket.
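As a quick numerical illustration of this conditional independence, the following sketch simulates a toy linear-Gaussian system (the coefficients and the names \(\eta\), \(s\), \(\mu\) are illustrative, not from any package) in which an external state drives a sensory state, which in turn drives an internal state. Marginally, \(\mu\) and \(\eta\) are strongly correlated; given the blanket state \(s\), their partial correlation is approximately zero.

```julia
using Random, Statistics

Random.seed!(1)
n = 100_000
η = randn(n)                       # external state (environment)
s = 0.8 .* η .+ 0.3 .* randn(n)    # sensory state: driven by η
μ = 0.9 .* s .+ 0.2 .* randn(n)    # internal state: driven only through s

# Partial correlation of μ and η given s:
# regress each on s and correlate the residuals
resid(v) = v .- (cov(v, s) / var(s)) .* s
println("cor(μ, η)     = ", round(cor(μ, η); digits = 3))                 # far from 0
println("cor(μ, η | s) = ", round(cor(resid(μ), resid(η)); digits = 3))   # ≈ 0
```

The blanket here is just \(s\) (there is no action variable in this toy model); conditioning on it screens the internal state off from the environment, exactly as \(\mu ⫫ \eta \mid (s, a)\) asserts in the general case.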

29.4 Integrating Markov Boundary and Markov Blanket

29.4.1 Unified Principle for CDM Construction

When constructing a CDM for causal decision-making, combine both perspectives:

  1. Identify the target variable(s): What outcomes are we interested in? (e.g., \(Y_T\) for policy evaluation)

  2. Determine the Markov boundary: For each target variable, identify its Markov boundary (see Graph Theory and Causal Patterns)—this is the minimal set of variables needed for causal reasoning at all three levels of Reason

  3. Define the Markov blanket: Identify internal states (\(X_t\)), sensory states (\(Y_t\)), and active states (\(A_t\)) (see From Structure to Time: FEP and Attractors)—this defines the system boundary from an FEP perspective

  4. Include the union: Include all variables in the Markov boundary of target variables, plus all variables in the Markov blanket. This ensures:

    • We have all necessary information for causal reasoning (Markov boundary—Structural world)
    • We have a principled system boundary (Markov blanket—Structural/Dynamical world)
  5. Exclude everything else: Variables not in the Markov boundary (for target variables) and not part of the Markov blanket can be excluded, as they are either irrelevant for causal reasoning or conditionally independent given the blanket

29.4.2 Implementation: Determining CDM Scope

Here’s an example of using Markov boundary and Markov blanket to determine CDM scope:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using DAGMakie CairoMakie CausalDynamics Graphs

# Example: Decision-making about outcome Y
# Graph: Z → X → Y ← W, with Z → W
# Target: Y (outcome we care about)

g = SimpleDiGraph(4)
add_edge!(g, 1, 2)  # Z → X
add_edge!(g, 2, 3)  # X → Y
add_edge!(g, 4, 3)  # W → Y
add_edge!(g, 1, 4)  # Z → W

# Step 1: Compute Markov boundary of Y (target variable)
"""
    markov_boundary(g, node)

Compute the Markov boundary of `node`: its parents, children, and spouses
(other parents of its children). Returns the minimal set of nodes needed
for causal reasoning about the target node.
"""
function markov_boundary(g::AbstractGraph, node::Integer)
    parents = inneighbors(g, node)
    children = outneighbors(g, node)

    spouses = Int[]
    for child in children
        for parent in inneighbors(g, child)
            parent != node && push!(spouses, parent)
        end
    end

    return sort(unique(vcat(parents, children, spouses)))
end

mb_Y = markov_boundary(g, 3)  # Markov boundary of Y (node 3)
println("Markov boundary of Y: ", mb_Y)  # Should be [2, 4] = {X, W}

# Step 2: Identify Markov blanket components
# Internal states (X): Latent process states
# Sensory states (Y): Observations
# Active states (A): Actions/interventions (if we add them)
# External states: Everything else

internal_states = [2]  # X (latent)
sensory_states = [3]  # Y (observed)
# Active states would be added if we have interventions

markov_blanket = sort(unique([internal_states; sensory_states]))
println("Markov blanket: ", markov_blanket)

# Step 3: Combine for CDM scope
cdm_scope = sort(unique([mb_Y; markov_blanket]))
println("CDM scope (union): ", cdm_scope)
println("  Includes: Markov boundary of Y + Markov blanket components")

# Visualise
let
    node_labels = ["Z", "X", "Y", "W"]
    node_colors = [:lightgray, :yellow, :lightgreen, :yellow]  # Highlight included nodes
    
    fig, ax, p = dagplot(g;
        figure_size = (800, 600),
        layout_mode = :acyclic,
        node_color = node_colors,
        nlabels = node_labels
    )
    
    fig  # Only this gets displayed
end
Markov boundary of Y: [2, 4]
Markov blanket: [2, 3]
CDM scope (union): [2, 3, 4]
  Includes: Markov boundary of Y + Markov blanket components

Determining CDM scope using Markov boundary and Markov blanket

29.5 Policies as Interventions

29.5.1 Definition

A policy (dynamic treatment strategy) is a function that maps history to action:

\[ A_t = \pi(H_t) \]

where \(H_t = (Y_{1:t}, A_{1:t-1}, L_{1:t})\) is the observed history.
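Concretely, a policy is just a function of the observed history. The following sketch (the `History` struct and `policy_threshold` rule are illustrative, not from any package) encodes a simple threshold policy \(A_t = \pi(H_t)\):

```julia
# Minimal representation of an observed history H_t = (Y_{1:t}, A_{1:t-1})
struct History
    Y::Vector{Float64}   # outcomes Y_{1:t}
    A::Vector{Int}       # past actions A_{1:t-1}
end

# Hypothetical threshold policy: treat (A_t = 1) when the latest outcome
# exceeds a cutoff, otherwise withhold treatment (A_t = 0)
policy_threshold(H::History; cutoff = 0.5) = last(H.Y) > cutoff ? 1 : 0

H = History([0.2, 0.7], [0])
policy_threshold(H)   # returns 1, since the latest outcome 0.7 > 0.5
```

In practice the rule may depend on the full history (past actions, covariates \(L_{1:t}\)), not just the latest outcome; the function signature is the same.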

29.5.2 Policy Intervention

Under policy \(\pi\), we set: \[ do(\pi): \quad A_t = \pi(H_t) \]

Interpretation: Structural intervention that sets treatment according to policy rule.
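Simulating under \(do(\pi)\) means that at each step the action is set by the policy rule rather than drawn from its observational mechanism. A minimal sketch, using made-up linear outcome dynamics for illustration:

```julia
using Random
Random.seed!(7)

π_rule(y) = y > 0.5 ? 1 : 0   # threshold rule on the latest outcome

T = 10
Y = zeros(T); A = zeros(Int, T)
Y[1] = 0.8
for t in 1:T-1
    A[t] = π_rule(Y[t])                                  # do(π): action set by the rule
    Y[t+1] = 0.9 * Y[t] - 0.3 * A[t] + 0.05 * randn()    # toy outcome dynamics
end
```

Replacing `π_rule` with a different rule and re-running the simulation is exactly how interventional policy comparisons are made later in this chapter.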

29.6 Causal Decision-Making Framework

29.6.1 The Decision Problem

Question: Which action should we take to maximise the expected outcome?

Framework:

  1. Define the objective: What do we want to maximise or minimise?
  2. Model interventions: How do actions affect outcomes?
  3. Optimise: Choose the action that maximises the expected outcome

29.6.2 Optimal Policy

The optimal policy maximises expected outcome:

\[ \pi^* = \arg\max_\pi \mathbb{E}^{do(\pi)}[Y_T] \]

Methods (Murphy 2003; Schulam and Saria 2017):

  • Dynamic programming: value iteration, policy iteration (Bellman 1957; Puterman 2014)
  • Reinforcement learning: Q-learning, policy gradient (Sutton and Barto 2018; Bertsekas 2019)
  • Causal methods: use a CDM to evaluate candidate policies
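As a minimal illustration of the dynamic-programming route, the following sketch runs tabular value iteration on a hypothetical three-state, two-action MDP (the transition probabilities `P` and rewards `R` are made up for illustration):

```julia
S, A = 3, 2
P = zeros(S, A, S)                  # P[s, a, s′] = transition probability
P[1, 1, :] = [0.9, 0.1, 0.0];  P[1, 2, :] = [0.2, 0.8, 0.0]
P[2, 1, :] = [0.1, 0.8, 0.1];  P[2, 2, :] = [0.0, 0.3, 0.7]
P[3, 1, :] = [0.0, 0.1, 0.9];  P[3, 2, :] = [0.0, 0.0, 1.0]
R = [0.0 0.0; 0.0 0.5; 1.0 0.0]     # R[s, a] = immediate reward
γ = 0.9                             # discount factor

# Action value under a value function V
Q(V, s, a) = R[s, a] + γ * sum(P[s, a, s′] * V[s′] for s′ in 1:S)

V = zeros(S)
for _ in 1:500                      # Bellman updates until numerical convergence
    V .= [maximum(Q(V, s, a) for a in 1:A) for s in 1:S]
end
π_star = [argmax([Q(V, s, a) for a in 1:A]) for s in 1:S]
println("Optimal action per state: ", π_star)   # → [2, 2, 1] for this toy model
```

The greedy policy with respect to the converged value function is the optimal policy \(\pi^*\) for this MDP; the causal route plays the same game, but with \(\mathbb{E}^{do(\pi)}[Y_T]\) evaluated through a CDM rather than a known transition table.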

29.6.3 Uncertainty in Decision-Making

Decision-making must account for:

  • Model uncertainty: Which model is correct?
  • Parameter uncertainty: What are the parameter values?
  • Structural uncertainty: What is the causal structure?
  • Process uncertainty: Intrinsic stochasticity
  • Observation uncertainty: Measurement error

Robust decisions: Choose actions that perform well across model/parameter uncertainty.

Robust policy: Find a policy that performs well under uncertainty:

\[ \pi^* = \arg\max_\pi \min_{\theta \in \Theta} \mathbb{E}^{do(\pi)}_\theta[Y_T] \]

Interpretation: a policy that performs well even in the worst-case scenario.
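The maximin rule can be made concrete with a small table of hypothetical expected outcomes \(\mathbb{E}^{do(\pi)}_\theta[Y_T]\), one row per candidate policy and one column per parameter value \(\theta\) (all numbers invented for illustration):

```julia
# Rows: candidate policies; columns: parameter values θ ∈ Θ (hypothetical numbers)
E = [0.40 0.39 0.38;   # policy 1
     0.55 0.30 0.20;   # policy 2
     0.70 0.10 0.05]   # policy 3

worst = vec(minimum(E; dims = 2))   # worst-case expected outcome of each policy
π_star = argmax(worst)              # maximin: best worst case
println("Worst cases: ", worst, " → robust policy: ", π_star)
```

Policy 3 has the highest best-case outcome (0.70), but the maximin rule selects policy 1 because its guaranteed floor (0.38) is the highest.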

29.6.4 Active Inference and the Free Energy Perspective

So far we have treated decision-making in terms of policies that maximise expected outcomes under explicit CDMs. The Free Energy Principle (FEP) and active inference offer a complementary view in which agents are modelled as possessing internal generative models of their environment and acting to minimise expected surprise (variational free energy) under those models (Friston et al. 2006; Friston 2010, 2013). In this framing, policies are chosen not only to optimise a terminal outcome \(Y_T\), but to maintain the agent within a preferred region of state space—a homeostatic or viable set that defines what it is for the agent (or system) to persist.

Markov blankets provide the structural interface between agents and environments: internal states encode beliefs (and preferences) about external states, sensory states report what is happening at the boundary, and active states change the environment so that future sensory inputs remain within expected bounds. This can be read as a formalisation of goal-directed control under uncertainty: an agent acts to keep future observations within a preferred range (a viable region of state space), given its generative model (Whitehead 1929).

We will not develop active inference in full generality here, but it is worth noting that many of the objects we have already introduced—policies \(\pi\), Markov blankets, CDMs, and robustness notions—have natural counterparts in the FEP literature. In particular, robust policies can be interpreted as those that keep the system within a low-free-energy region of its state space despite uncertainty and perturbations, maintaining both causal coherence and functional viability.

29.6.5 Minimal Active Inference Example in Julia (Sketch)

To illustrate how these ideas can be implemented, we sketch a minimal active inference setup in Julia using a discrete-time, discrete-state model. The goal is for an agent to keep a scalar hidden state \(x_t\) near a preferred value (homeostasis) by choosing actions \(a_t \in \{-1, 0, +1\}\) that adjust \(x_t\). We use a simple tabular generative model and an approximate free-energy-based policy evaluation scheme.

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))
@auto_using Distributions CairoMakie


# Discrete hidden states x ∈ {1,2,3,4,5}, observations y = x + noise, actions a ∈ {-1,0,+1}
states = 1:5
actions = [-1, 0, 1]
observ_noise = Normal(0.0, 0.3)

# Preferred state distribution: prefer x = 3
prior_pref = [0.05, 0.1, 0.7, 0.1, 0.05]

# Transition model p(x_{t+1} | x_t, a_t)
function transition_probs(x, a)
    x_next = clamp(x + a, first(states), last(states))
    probs = zeros(length(states))
    probs[x_next] = 1.0  # deterministic for simplicity
    return probs
end

# Likelihood p(y_t | x_t)
function likelihood_y_given_x(y, x)
    pdf(observ_noise, y - x)
end

# One-step expected free energy for action a from belief q(x_t)
function expected_free_energy(q_x, a;
        n_samples = 21, y_min = 0.5, y_max = 5.5)
    # Predictive distribution over x_{t+1}
    q_x_next = zeros(length(states))
    for (i, x) in enumerate(states)
        trans = transition_probs(x, a)
        q_x_next .+= q_x[i] .* trans
    end

    # Approximate predictive distribution over observations y_{t+1}
    ys = range(y_min, y_max; length = n_samples)
    q_y = zeros(length(ys))
    for (j, y) in enumerate(ys)
        for (i, x) in enumerate(states)
            q_y[j] += q_x_next[i] * likelihood_y_given_x(y, x)
        end
    end
    q_y ./= sum(q_y)  # normalise

    # Expected "risk" term: divergence from preferred states (via observations as proxy)
    # Here we use a simple surrogate: map y back to nearest state and compare to prior_pref
    risk = 0.0
    for (j, y) in enumerate(ys)
        x_idx = clamp(round(Int, y), first(states), last(states))
        p_pref = prior_pref[x_idx]
        if q_y[j] > 0 && p_pref > 0
            risk += q_y[j] * log(q_y[j] / p_pref)
        end
    end

    return risk
end

# Simulate active inference control
T = 30
x_true = zeros(Int, T)
y_obs = zeros(Float64, T)
q_hist = Vector{Vector{Float64}}(undef, T)
a_hist = zeros(Int, T)

# Initial true state and belief
x_true[1] = 2
q_x = fill(1.0 / length(states), length(states))  # initial uniform belief

for t in 1:T
    # Observe y_t (noisy reading of the true hidden state)
    y_obs[t] = x_true[t] + rand(observ_noise)

    # Bayesian update of q(x_t | y_{1:t}) via simple likelihood multiplication
    for (i, x) in enumerate(states)
        q_x[i] *= likelihood_y_given_x(y_obs[t], x)
    end
    q_x ./= sum(q_x)
    q_hist[t] = copy(q_x)

    # Evaluate candidate actions by approximate expected free energy
    g_vals = [expected_free_energy(q_x, a) for a in actions]
    # Choose action that minimises expected free energy
    a_idx = argmin(g_vals)
    a_t = actions[a_idx]
    a_hist[t] = a_t

    # Evolve true state
    if t < T
        x_next_probs = transition_probs(x_true[t], a_t)
        x_true[t + 1] = rand(Distributions.Categorical(x_next_probs))
    end
end

# Visualise true state, observations, and actions
let
    fig = Figure(size = (900, 500))
    ax1 = Axis(fig[1, 1],
        xlabel = "Time",
        ylabel = "State / observation",
        title = "Active-inference-style control towards preferred state"
    )

    lines!(ax1, 1:T, x_true,
        label = "True state x_t",
        color = :blue,
        linewidth = 2
    )
    scatter!(ax1, 1:T, y_obs,
        label = "Observations y_t",
        color = :black,
        markersize = 4,
        alpha = 0.6
    )
    hlines!(ax1, [3.0],
        color = :green,
        linestyle = :dash,
        label = "Preferred state"
    )

    ax2 = Axis(fig[2, 1],
        xlabel = "Time",
        ylabel = "Action a_t",
        title = "Selected actions"
    )
    stem!(ax2, 1:T, a_hist,
        color = :red,
        markersize = 6
    )

    axislegend(ax1, position = :rb)
    fig  # Only this gets displayed
end

Minimal active-inference-style control: maintaining a latent state near a preferred value

This toy example is not a full implementation of ActiveInference.jl or RxInfer.jl, but it shows the core ideas: a generative model over hidden states and observations, a preference distribution, approximate Bayesian updating of beliefs, and action selection via an expected-free-energy-like objective that keeps the system near preferred states. More sophisticated models can be built using ActiveInference.jl, RxInfer.jl, and related packages for realistic CDMs.

29.6.6 Implementation: Decision-Making Under Uncertainty

Here’s an example of making robust decisions under parameter uncertainty:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using Random Distributions DifferentialEquations CairoMakie

Random.seed!(42)

# Example: Treatment decision under parameter uncertainty
# We're uncertain about treatment effectiveness α ∈ [0.2, 0.4]

β = 0.2  # Disease progression (known)
α_values = [0.2, 0.3, 0.4]  # Uncertain treatment effectiveness

# Policies to consider
policy1 = (X, t) -> X > 0.6 ? 1.0 : 0.0  # Conservative
policy2 = (X, t) -> X > 0.4 ? 1.0 : 0.0  # Moderate
policy3 = (X, t) -> 1.0  # Aggressive

"""
    disease_uncertain!(du, u, p, t)

Disease model with a treatment policy and uncertain treatment effect α.
"""
function disease_uncertain!(du, u, p, t)
    X = u[1]
    policy, α = p
    A = policy(X, t)           # action chosen by the policy
    du[1] = β * X - α * A * X  # progression minus treatment effect
end

u0 = [0.3]
tspan = (0.0, 20.0)

# Evaluate each policy under each parameter value
policies = [("Conservative", policy1), ("Moderate", policy2), ("Aggressive", policy3)]
results = Dict()

for (name, policy) in policies
    outcomes = Float64[]
    for α in α_values
        prob = ODEProblem(disease_uncertain!, u0, tspan, (policy, α))  # Using tuple for better performance
        sol = solve(prob, Tsit5())
        push!(outcomes, sol.u[end][1])
    end
    results[name] = outcomes
end

# Robust decision: minimise the worst-case (largest) final severity
worst_case = Dict(name => maximum(outcomes) for (name, outcomes) in results)
robust_policy = argmin(worst_case)  # key whose worst-case severity is smallest

# Visualise
let
fig = Figure(size = (1200, 400))
ax1 = Axis(fig[1, 1], title = "Outcomes Under Parameter Uncertainty", 
           xlabel = "Policy", ylabel = "Final Severity")
ax2 = Axis(fig[1, 2], title = "Worst-Case Outcomes (Robust Decision)", 
           xlabel = "Policy", ylabel = "Worst-Case Severity")

colors = [:blue, :green, :red]
for (i, (name, _)) in enumerate(policies)  # fixed order (Dict iteration order is arbitrary)
    outcomes = results[name]
    # Show range with error bars (boxplot not available, use scatter with range)
    scatter!(ax1, fill(i, length(outcomes)), outcomes, color = colors[i], alpha = 0.5, label = name)
    # Show mean and range
    mean_outcome = mean(outcomes)
    min_outcome = minimum(outcomes)
    max_outcome = maximum(outcomes)
    scatter!(ax1, [i], [mean_outcome], color = colors[i], markersize = 10, marker = :rect)
    errorbars!(ax1, [i], [mean_outcome], [mean_outcome - min_outcome], [max_outcome - mean_outcome], 
               color = colors[i], linewidth = 2)
    
    # Worst case
    barplot!(ax2, [i], [worst_case[name]], color = colors[i], label = name)
end

axislegend(ax1)
axislegend(ax2)

    fig  # Only this gets displayed
end

println("Decision-making under uncertainty:")
for (name, outcomes) in results
    println("  ", name, ": outcomes = ", [round(o, digits=3) for o in outcomes])
    println("    worst-case = ", round(worst_case[name], digits=3))
end
println("\nRobust policy (minimises worst-case): ", robust_policy)
Decision-making under uncertainty:
  Moderate: outcomes = [0.404, 0.402, 0.4]
    worst-case = 0.404
  Aggressive: outcomes = [0.3, 0.041, 0.005]
    worst-case = 0.3
  Conservative: outcomes = [0.61, 0.603, 0.6]
    worst-case = 0.61

Robust policy (minimises worst-case): Aggressive

29.7 World Context

This chapter addresses Doing in the Observable world—how can we make decisions using causal models? Causal decision-making uses observable data and causal models to choose actions that maximise expected outcomes, bridging the Observable world (what we observe) with the Structural world (what would happen under interventions).

29.8 Key Takeaways

  1. Markov boundary: Minimal set for causal reasoning about target variables
  2. Markov blanket: Principled system boundary from FEP perspective
  3. Unified principle: Combine both to determine CDM scope
  4. Causal decision-making: Choose actions to maximise expected outcomes
  5. Uncertainty matters: Robust decisions account for model/parameter uncertainty

29.9 Further Reading