23  From Dynamical to Observable: Measurement and Actualisation

Status: Draft

v0.4

23.1 Learning Objectives

After reading this chapter, you will be able to:

  • Understand how dynamic processes become observable
  • Define observation models that connect Dynamical to Observable
  • Recognise partial observability in dynamical systems
  • Prepare for working with observable data

23.2 Introduction

This chapter bridges Dynamical and Observable worlds, showing how dynamic processes (inner worlds) become actualised as observations (outer world) through measurement processes. This prepares for Part III where we work with observable data. This is the transition from Doing/Imagining in the Dynamical world to Seeing in the Observable world.

23.3 How Dynamic Processes Become Observable

23.3.1 The Observation Model

Dynamic processes exist in the Dynamical world (inner world), and become observable through the observation model:

\[ Y_t = h(X_t, C, U^y_t) \]

where:

  • \(X_t\) is the latent state (exists in the Dynamical world)
  • \(Y_t\) is the observation (exists in the Observable world)
  • \(C\) is the context (background conditions entering the measurement)
  • \(h\) is the observation function (encodes how inner worlds become outer worlds)
  • \(U^y_t\) is measurement noise
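
As a minimal sketch (the helper names here are hypothetical, not from this book's codebase), the observation model can be written directly as a Julia function that maps a latent state to a noisy observation:

```julia
# Hypothetical sketch of the observation model Y = h(X, C) + U:
# `h` renders the latent state X (and context C) observable; U is Gaussian noise.
observe(h, x; c = nothing, σ = 0.1) = h(x, c) + σ * randn()

# Example: direct observation of a scalar state, ignoring context
h_direct(x, c) = x
y = observe(h_direct, 1.0)   # 1.0 plus N(0, 0.1²) noise
```

Different choices of `h` give the direct, indirect, and aggregate measurement types discussed later in this chapter.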

23.3.2 The Actualisation Process

The observation model represents the process of actualisation:

  • Dynamical world (inner): dynamic processes, latent states \(X_t\)
  • Observable world (outer): actualised observations \(Y_t\)
  • Observation function \(h\): how inner worlds become outer worlds

23.3.3 Implementation: Observation Models

We can demonstrate different types of observation models:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using Random Distributions DifferentialEquations CairoMakie

Random.seed!(42)

# Example: Latent state X_t evolving over time
"""
Latent state dynamics: exponential decay with rate 0.1.
"""
function latent_dynamics!(du, u, p, t)
    X = u[1]
    du[1] = -0.1 * X  # Decay process
end

u0 = [1.0]
tspan = (0.0, 20.0)
prob = ODEProblem(latent_dynamics!, u0, tspan)
sol = solve(prob, Tsit5())

X_t = [u[1] for u in sol.u]  # Latent state

# Type 1: Direct measurement Y_t = X_t + noise
σ_direct = 0.1
Y_direct = X_t .+ rand(Normal(0, σ_direct), length(X_t))

# Type 2: Indirect measurement Y_t = h(X_t) + noise
# Example: h(X) = X² (nonlinear observation)
Y_indirect = X_t.^2 .+ rand(Normal(0, 0.05), length(X_t))

# Type 3: Aggregate measurement (if we had multiple states)
# Y_t = Σ X_i + noise (simplified here)
Y_aggregate = X_t .+ rand(Normal(0, 0.1), length(X_t))  # Simplified

# Visualise
let
fig = Figure(size = (1200, 400))
ax1 = Axis(fig[1, 1], title = "Direct Measurement: Y = X + noise", 
           xlabel = "Time", ylabel = "Value")
ax2 = Axis(fig[1, 2], title = "Indirect Measurement: Y = X² + noise", 
           xlabel = "Time", ylabel = "Value")
ax3 = Axis(fig[1, 3], title = "Latent vs Observed", 
           xlabel = "Time", ylabel = "Value")

lines!(ax1, sol.t, X_t, label = "Latent X_t", linewidth = 2, color = :blue)
scatter!(ax1, sol.t, Y_direct, label = "Observed Y_t", color = :red, alpha = 0.5, markersize = 3)
axislegend(ax1)

lines!(ax2, sol.t, X_t.^2, label = "h(X_t) = X²", linewidth = 2, color = :blue)
scatter!(ax2, sol.t, Y_indirect, label = "Observed Y_t", color = :red, alpha = 0.5, markersize = 3)
axislegend(ax2)

lines!(ax3, sol.t, X_t, label = "Latent X_t", linewidth = 2, color = :blue)
scatter!(ax3, sol.t, Y_direct, label = "Direct obs", color = :red, alpha = 0.5, markersize = 3)
scatter!(ax3, sol.t, sqrt.(Y_indirect), label = "Indirect obs (√Y)", color = :green, alpha = 0.5, markersize = 3)
axislegend(ax3)

    fig  # Only this gets displayed
end

println("Observation models:")
println("  Direct: Y_t = X_t + noise (σ = ", σ_direct, ")")
println("  Indirect: Y_t = X_t² + noise (nonlinear)")
println("  Inference: Must recover X_t from Y_t (addressed in state-space models)")
Observation models:
  Direct: Y_t = X_t + noise (σ = 0.1)
  Indirect: Y_t = X_t² + noise (nonlinear)
  Inference: Must recover X_t from Y_t (addressed in state-space models)

23.4 Partial Observability

23.4.1 The Problem

In most real systems, we cannot directly observe the latent state \(X_t\):

  • Partial observability: we only observe \(Y_t\), not \(X_t\)
  • Measurement noise: observations are corrupted by noise
  • Missing data: some observations may be absent entirely

23.4.2 Inference Problem

Given observations \(Y_{1:T}\), we must infer the latent state \(X_{1:T}\):

\[ P(X_{1:T} \mid Y_{1:T}) \]

This is the inference problem addressed in State-Space Models: Inferring Structure from Observations.
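
For linear-Gaussian state-space models this inference problem has a closed-form recursive solution, the Kalman filter. The following sketch (illustrative, not this book's implementation) filters a scalar AR(1) state observed with additive noise, using the same parameter values as the example below:

```julia
# Minimal scalar Kalman filter for the linear-Gaussian model
#   X_t = a X_{t-1} + w_t,   w_t ~ N(0, q)
#   Y_t = X_t + v_t,         v_t ~ N(0, r)
# Returns the filtered means m_t = E[X_t | Y_{1:t}] and variances P_t.
function kalman_filter(y; a = 0.9, q = 0.01, r = 0.04, m0 = 0.0, P0 = 1.0)
    T = length(y)
    m, P = zeros(T), zeros(T)
    m_prev, P_prev = m0, P0
    for t in 1:T
        # Predict step: propagate the previous estimate through the dynamics
        m_pred = a * m_prev
        P_pred = a^2 * P_prev + q
        # Update step: blend the prediction with the observation
        K = P_pred / (P_pred + r)    # Kalman gain, between 0 and 1
        m[t] = m_pred + K * (y[t] - m_pred)
        P[t] = (1 - K) * P_pred
        m_prev, P_prev = m[t], P[t]
    end
    return m, P
end
```

Unlike the fixed-gain smoother used in the demonstration below, the Kalman gain \(K\) adapts automatically to the relative sizes of process and measurement noise.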

23.4.3 Implementation: Partial Observability

We can demonstrate the inference problem when we only observe \(Y_t\), not \(X_t\):

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using Random Distributions CairoMakie

Random.seed!(42)

# Example: We observe Y_t but not X_t
# Latent state: X_t (unknown)
# Observation: Y_t = X_t + noise

# Simulate true latent state
T = 50
X_true = zeros(T)
X_true[1] = 1.0
for t in 2:T
    X_true[t] = 0.9 * X_true[t-1] + rand(Normal(0, 0.1))  # AR(1) process
end

# Observations with noise
σ_obs = 0.2
Y_obs = X_true .+ rand(Normal(0, σ_obs), T)

# Inference problem: Given Y_obs, infer X_true
# Simple approach: Use observations directly (ignoring dynamics)
X_inferred_naive = Y_obs

# Better approach: combine observations with the dynamics
# (a fixed-gain filter, i.e. a simplified stand-in for the Kalman filter)
X_inferred_smooth = zeros(T)
α = 0.5  # Weight on the observation (fixed filter gain)
for t in 1:T
    if t == 1
        X_inferred_smooth[t] = Y_obs[t]
    else
        # Combine observation with prediction from dynamics
        X_pred = 0.9 * X_inferred_smooth[t-1]
        X_inferred_smooth[t] = α * Y_obs[t] + (1 - α) * X_pred
    end
end

# Visualise
let
fig = Figure(size = (1000, 400))
ax = Axis(fig[1, 1], title = "Partial Observability: Inferring Latent States", 
          xlabel = "Time", ylabel = "Value")

lines!(ax, 1:T, X_true, label = "True latent X_t", linewidth = 2, color = :blue)
scatter!(ax, 1:T, Y_obs, label = "Observed Y_t", color = :red, alpha = 0.5, markersize = 4)
lines!(ax, 1:T, X_inferred_naive, label = "Naive inference (Y_t)", 
       linewidth = 1, color = :orange, linestyle = :dash)
lines!(ax, 1:T, X_inferred_smooth, label = "Smooth inference (uses dynamics)", 
       linewidth = 2, color = :green, linestyle = :dash)
axislegend(ax)

    fig  # Only this gets displayed
end

println("Partial observability:")
println("  True latent state: X_t (unknown)")
println("  Observations: Y_t = X_t + noise")
println("  Inference problem: Recover X_t from Y_t")
println("  Naive approach: Use Y_t directly (ignores dynamics)")
println("  Better approach: Combine observations with dynamics (state-space inference)")
Partial observability:
  True latent state: X_t (unknown)
  Observations: Y_t = X_t + noise
  Inference problem: Recover X_t from Y_t
  Naive approach: Use Y_t directly (ignores dynamics)
  Better approach: Combine observations with dynamics (state-space inference)

23.5 Measurement Processes

23.5.1 Types of Measurement

  • Direct measurement: \(Y_t = X_t + \text{noise}\) (observe state directly with noise)
  • Indirect measurement: \(Y_t = h(X_t) + \text{noise}\) (observe function of state)
  • Aggregate measurement: \(Y_t = \sum_i X^i_t + \text{noise}\) (observe sum of states)
  • Delayed measurement: \(Y_t = X_{t-k} + \text{noise}\) (observe past state)
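
Each measurement type above can be sketched as a one-line Julia function over a stored state path (helper names hypothetical; `σ` is the noise standard deviation):

```julia
# X is a vector of scalar latent states over time; Xs is a vector of such paths.
direct(X, t; σ = 0.1)             = X[t] + σ * randn()
indirect(X, t; h = abs2, σ = 0.1) = h(X[t]) + σ * randn()
aggregate(Xs, t; σ = 0.1)         = sum(x[t] for x in Xs) + σ * randn()
delayed(X, t; k = 1, σ = 0.1)     = X[t-k] + σ * randn()
```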

23.5.2 Measurement Design

Question: What should we measure to maximise information about the latent states?

Answer: Design measurements that maximise the mutual information \(I(X_t; Y_t)\).
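
In the jointly Gaussian case this objective has a closed form in terms of correlation, which is why the implementation in 23.5.3 can use correlation as a proxy. A sketch (helper names hypothetical):

```julia
# For jointly Gaussian (X, Y) with correlation ρ:
#   I(X; Y) = -½ log(1 - ρ²)   (in nats)
gaussian_mi(ρ) = -0.5 * log(1 - ρ^2)

# Direct measurement Y = X + ε with X ~ N(0,1), ε ~ N(0, σ²) gives
# ρ = 1/√(1 + σ²), so information decays smoothly as noise grows.
ρ_direct(σ) = 1 / sqrt(1 + σ^2)
gaussian_mi(ρ_direct(0.1))   # ≈ 2.31 nats
```

For non-Gaussian measurements (such as the binary option below) correlation is only a heuristic proxy and can understate or overstate the true mutual information.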

23.5.3 Implementation: Measurement Design

We can demonstrate how to choose measurements that maximise information:

# Find project root and include ensure_packages.jl
project_root = let
    current = pwd()
    while !isfile(joinpath(current, "Project.toml")) && !isfile(joinpath(current, "_quarto.yml"))
        parent = dirname(current)
        parent == current && break
        current = parent
    end
    current
end
include(joinpath(project_root, "scripts", "ensure_packages.jl"))

@auto_using Random Distributions CairoMakie

Random.seed!(42)

# Example: Two possible measurement functions
# Option 1: Y₁ = X + noise (direct, high information)
# Option 2: Y₂ = sign(X) + noise (binary, lower information)

# Simulate latent state
n = 1000
X = rand(Normal(0, 1), n)

# Measurement option 1: Direct (high information)
σ1 = 0.1
Y1 = X .+ rand(Normal(0, σ1), n)

# Measurement option 2: Binary (lower information)
Y2 = sign.(X) .+ rand(Normal(0, 0.1), n)

# Mutual information I(X; Y) = H(X) - H(X|Y); here we use |correlation|
# as a cheap proxy (exact for jointly Gaussian pairs: I = -½ log(1 - ρ²))
cor1 = abs(cor(X, Y1))
cor2 = abs(cor(X, Y2))

# Visualise
let
fig = Figure(size = (1000, 400))
ax1 = Axis(fig[1, 1], title = "Measurement 1: Y = X + noise (high info)", 
           xlabel = "X", ylabel = "Y₁")
ax2 = Axis(fig[1, 2], title = "Measurement 2: Y = sign(X) + noise (low info)", 
           xlabel = "X", ylabel = "Y₂")

scatter!(ax1, X, Y1, alpha = 0.3, color = :blue, markersize = 2)
scatter!(ax2, X, Y2, alpha = 0.3, color = :red, markersize = 2)

    fig  # Only this gets displayed
end

println("Measurement design comparison:")
println("  Option 1 (direct): Correlation = ", round(cor1, digits=3), " (high information)")
println("  Option 2 (binary): Correlation = ", round(cor2, digits=3), " (lower information)")
println("  → Option 1 provides more information about X")
println("  → Design measurements to maximise I(X; Y)")
Measurement design comparison:
  Option 1 (direct): Correlation = 0.995 (high information)
  Option 2 (binary): Correlation = 0.793 (lower information)
  → Option 1 provides more information about X
  → Design measurements to maximise I(X; Y)

23.6 Information-Theoretic Causal Measures

Beyond mutual information, information theory provides powerful tools for quantifying directed causal influence between time series. These measures capture how information flows from one process to another over time, complementing the observation and measurement frameworks discussed above.

23.6.1 Transfer Entropy

Transfer entropy measures directed information flow between time series, answering: “How much does the past of \(X\) help predict the future of \(Y\) beyond what \(Y\)’s own past already provides?”

The mathematical definition is:

\[ TE_{X \to Y} = \sum p(y_{t+1}, y_t^{(k)}, x_t^{(\ell)}) \log \frac{p(y_{t+1} \mid y_t^{(k)}, x_t^{(\ell)})}{p(y_{t+1} \mid y_t^{(k)})} \]

where \(y_t^{(k)}\) denotes the \(k\)-lag history of \(Y\) and \(x_t^{(\ell)}\) the \(\ell\)-lag history of \(X\). Introduced by Schreiber (2000), transfer entropy is:

  • Non-parametric: Does not assume linearity or Gaussianity
  • Asymmetric: \(TE_{X \to Y} \neq TE_{Y \to X}\) in general, capturing directionality
  • Conditional form: To control for confounders \(Z\), use conditional transfer entropy \(TE_{X \to Y \mid Z}\)

For jointly Gaussian processes, transfer entropy is equivalent to Granger causality: \(TE_{X \to Y}\) equals one half of the Granger causality (log-likelihood-ratio) statistic (Barnett, Barrett and Seth 2009). Applications include neural information flow, gene regulatory interactions, and financial time series.
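
The Gaussian equivalence can be checked numerically. This sketch (coefficients assumed for illustration) simulates a system where \(X\) drives \(Y\) and computes transfer entropy in nats as half the log-ratio of the restricted to the full OLS residual variance:

```julia
using Random, Statistics

Random.seed!(42)
n = 5000
x = randn(n)            # driving process
y = zeros(n)
for t in 2:n
    y[t] = 0.5 * y[t-1] + 0.5 * x[t-1] + 0.1 * randn()
end

# OLS residual variance of a target regressed on the given predictors
function resid_var(target, predictors...)
    A = hcat(ones(length(target)), predictors...)
    var(target .- A * (A \ target))
end

yt = y[2:end]
s2_restricted = resid_var(yt, y[1:end-1])              # Y's past only
s2_full       = resid_var(yt, y[1:end-1], x[1:end-1])  # Y's past + X's past
TE_xy = 0.5 * log(s2_restricted / s2_full)   # Gaussian TE, in nats
```

Because \(X\) genuinely drives \(Y\), `TE_xy` is large, while the same computation in the reverse direction stays near zero.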

23.6.2 Directed Information

Directed information (Massey 1990) captures the total causal influence of \(X\) on \(Y\) over a time horizon:

\[ I(X^n \to Y^n) = \sum_{t=1}^n I(X^t; Y_t \mid Y^{t-1}) \]

It sums the incremental information that \(X^t\) provides about \(Y_t\) given \(Y\)’s past. Directed information connects to channel capacity and feedback systems, and is relevant for understanding communication in biological networks where information flows bidirectionally with delay.
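
A concrete special case (a standard information-theory result, not specific to this text): for a memoryless channel used without feedback, directed information reduces to ordinary mutual information. For a binary symmetric channel with crossover probability \(\varepsilon\) and uniform inputs:

```julia
# Binary entropy in bits
H₂(p) = p == 0 || p == 1 ? 0.0 : -(p * log2(p) + (1 - p) * log2(1 - p))

# I(Xⁿ → Yⁿ) = I(Xⁿ; Yⁿ) = n · (1 − H₂(ε)) bits for a feedback-free BSC
directed_info_bsc(n, ε) = n * (1 - H₂(ε))

directed_info_bsc(10, 0.0)   # noiseless channel: 10.0 bits
directed_info_bsc(10, 0.5)   # pure-noise channel: 0.0 bits
```

With feedback the two quantities separate, and directed information is the one that correctly characterises capacity.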

23.6.3 Causal Entropy

Causal entropy quantifies the uncertainty in counterfactual distributions—the entropy of \(Y\) under interventions on \(X\):

\[ H_{\text{causal}}(Y \mid do(X)) = -\sum_x P(X=x) \sum_y P(Y=y \mid do(X=x)) \log P(Y=y \mid do(X=x)) \]

This measures the residual uncertainty in \(Y\) when we intervene on \(X\), contrasting with associational entropy \(H(Y \mid X)\) which conditions on observation. The difference captures how causal knowledge reduces uncertainty beyond correlation.
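
A minimal worked contrast (a hypothetical confounded example, not from the text): let \(Z \sim \text{Bernoulli}(0.5)\) be a hidden common cause with \(X := Z\) and \(Y := Z\). Observationally \(X\) determines \(Y\), but intervening on \(X\) leaves \(Y\) governed by \(Z\):

```julia
# Binary entropy in bits
H₂(p) = p == 0 || p == 1 ? 0.0 : -(p * log2(p) + (1 - p) * log2(1 - p))

H_assoc  = 0.0       # H(Y | X): observing X = Z pins down Y exactly
H_causal = H₂(0.5)   # H(Y | do(X)): do(X) severs X from Z, so Y ~ Bernoulli(0.5)

# Here the causal entropy *exceeds* the associational one: the apparent
# predictive power of X over Y was entirely due to confounding by Z.
```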

23.6.4 Practical Considerations

Estimation from finite data requires care:

  • Density estimation: Kernel density, \(k\)-nearest-neighbour, or binning for discrete approximations
  • Bias correction: The Kraskov–Stögbauer–Grassberger (KSG) estimator reduces bias in mutual-information-based quantities
  • Significance testing: Surrogate data methods (e.g. block permutation) to test against the null of no causal influence

The following conceptual code illustrates transfer entropy computation using a simple histogram-based estimator:

"""
    transfer_entropy_conceptual(x, y; k=2, ℓ=2)

Plug-in (histogram) estimate of transfer entropy TE_{X→Y} in nats for
discrete-valued series `x` and `y`, with history lengths `k` and `ℓ`.
"""
function transfer_entropy_conceptual(x, y; k=2, ℓ=2)
    n = length(y)
    counts_full = Dict{Any,Int}()  # (y_{t+1}, y_t^{(k)}, x_t^{(ℓ)})
    counts_yy   = Dict{Any,Int}()  # (y_{t+1}, y_t^{(k)})
    counts_yx   = Dict{Any,Int}()  # (y_t^{(k)}, x_t^{(ℓ)})
    counts_y    = Dict{Any,Int}()  # y_t^{(k)}
    N = 0
    for t in max(k, ℓ):(n - 1)
        y_hist = Tuple(y[t-k+1:t])
        x_hist = Tuple(x[t-ℓ+1:t])
        y_next = y[t+1]
        counts_full[(y_next, y_hist, x_hist)] = get(counts_full, (y_next, y_hist, x_hist), 0) + 1
        counts_yy[(y_next, y_hist)] = get(counts_yy, (y_next, y_hist), 0) + 1
        counts_yx[(y_hist, x_hist)] = get(counts_yx, (y_hist, x_hist), 0) + 1
        counts_y[y_hist] = get(counts_y, y_hist, 0) + 1
        N += 1
    end
    # TE = Σ p(y⁺, yʰ, xʰ) log[ p(y⁺ | yʰ, xʰ) / p(y⁺ | yʰ) ]
    TE = 0.0
    for ((y_next, y_hist, x_hist), c) in counts_full
        p_joint     = c / N
        p_cond_full = c / counts_yx[(y_hist, x_hist)]
        p_cond_rest = counts_yy[(y_next, y_hist)] / counts_y[y_hist]
        TE += p_joint * log(p_cond_full / p_cond_rest)
    end
    return TE
end
# For production: use packages such as TransferEntropy.jl or CausalityTools.jl

23.7 World Context

This chapter addresses the transition from Dynamical to Observable—how dynamic processes become actualised as observations. This prepares for Part III where we work with observable data to learn about causal mechanisms. The observation model \(Y_t = h(X_t, C, U^y_t)\) shows how inner worlds (Dynamical) become outer worlds (Observable).

23.8 Key Takeaways

  1. Observation model: How dynamic processes become observable
  2. Partial observability: We cannot directly observe latent states
  3. Inference problem: Must infer latent states from observations
  4. Measurement design: Choose measurements to maximise information
  5. Information-theoretic causal measures: Transfer entropy, directed information, and causal entropy quantify directed causal influence between time series
  6. Transition to Observable: Prepares for working with observable data

23.9 Further Reading