37  Software Engineering Patterns for Causal-Dynamical Modelling

Status: Draft

v0.2

37.1 Learning Objectives

After reading this chapter, you will be able to:

  • Structure packages for causal-dynamical modelling
  • Design type systems for composability and performance
  • Write effective tests for probabilistic and causal code
  • Apply performance best practices
  • Create reusable components for CDMs

37.2 Introduction

Good software engineering is essential for reproducible, maintainable, and reusable scientific code. This chapter recommends a package structure, testing strategies, and performance practices for causal-dynamical modelling. The examples are in Julia, but the principles generalise to other languages.

37.3 Package Structure

37.3.2 Principles

  • Modularity: Separate concerns (process, observation, inference, causal)
  • Composability: Components can be combined
  • Extensibility: Easy to add new models/methods
  • Documentation: Clear docstrings and examples
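
The separation of concerns listed above can be sketched as one submodule per concern. This is a minimal illustration, not a prescribed layout: the names `CDMSketch`, `LogisticGrowth`, `GaussianObservation`, `advance`, and `observe` are hypothetical, and only the process and observation concerns are shown.

```julia
# Hypothetical package layout: one submodule per concern.
module CDMSketch

module Process
    export LogisticGrowth, advance

    struct LogisticGrowth
        r::Float64  # growth rate
        K::Float64  # carrying capacity
    end

    # One explicit Euler step of logistic growth.
    advance(m::LogisticGrowth, x, dt) = x + dt * m.r * x * (1 - x / m.K)
end

module Observation
    using Random
    export GaussianObservation, observe

    struct GaussianObservation
        σ::Float64  # observation noise standard deviation
    end

    # Noisy observation of the latent state x.
    observe(rng::AbstractRNG, m::GaussianObservation, x) = x + m.σ * randn(rng)
end

# The top-level module only re-exports; each concern can be tested on its own.
using .Process, .Observation
export LogisticGrowth, GaussianObservation, advance, observe

end # module CDMSketch

using .CDMSketch
using Random

x1 = advance(LogisticGrowth(1.0, 100.0), 50.0, 0.1)
y1 = observe(MersenneTwister(1), GaussianObservation(0.0), x1)
```

In a real package each submodule would typically live in its own file under `src/` and be pulled in with `include`.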

37.4 Type System Design

37.4.1 CDM Type

abstract type AbstractCDM end

struct CDM{F, H, P, O} <: AbstractCDM
    process_model::F      # Process dynamics
    observation_model::H  # Observation model
    parameters::P         # Parameters
    exogenous::O          # Exogenous noise structure
end

37.4.2 Benefits

  • Type stability: Concrete type parameters let the compiler infer field types and generate specialised code
  • Multiple dispatch: Different methods for different types
  • Composability: Can combine different components
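
To make the first two benefits concrete, the sketch below compares the parametric `CDM` with a hypothetical variant, `LooseCDM`, whose fields are left untyped. `LooseCDM`, the toy functions `growth` and `noisy`, and `describe` are assumptions made for illustration only.

```julia
abstract type AbstractCDM end

# Parametric definition from this section: every field has a concrete type.
struct CDM{F, H, P, O} <: AbstractCDM
    process_model::F
    observation_model::H
    parameters::P
    exogenous::O
end

# Hypothetical counter-example: untyped fields default to `Any`.
struct LooseCDM <: AbstractCDM
    process_model
    observation_model
    parameters
    exogenous
end

# Toy components, for illustration only.
growth(x, θ) = x + θ.r * x
noisy(x) = x

θ = (r = 0.1,)
tight = CDM(growth, noisy, θ, nothing)
loose = LooseCDM(growth, noisy, θ, nothing)

# The compiler knows every field type of `tight`, but none of `loose`.
tight_concrete = all(isconcretetype, fieldtypes(typeof(tight)))
loose_concrete = all(isconcretetype, fieldtypes(typeof(loose)))

# Multiple dispatch: a generic fallback plus a method specialised on the
# parameter type (the third type parameter of CDM).
describe(::AbstractCDM) = "generic CDM"
describe(::CDM{<:Any, <:Any, <:NamedTuple}) = "CDM with named parameters"
```

Here `tight_concrete` is `true` and `loose_concrete` is `false`, and `describe` picks the more specific method for `tight` while `loose` falls back to the generic one.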

37.5 Testing Strategies

37.5.1 Unit Tests

Test individual components in isolation:

@testset "Process Model" begin
    model = LotkaVolterra(r=1.0, K=100.0, α=0.1, β=0.02, δ=0.5)
    x0 = [50.0, 10.0]
    x1 = step(model, x0, 0.0, 0.1)
    @test length(x1) == 2
    @test all(x1 .> 0)
end

37.5.2 Integration Tests

Test components working together:

@testset "CDM Inference" begin
    cssm = create_ecosystem_cssm()
    data = simulate(cssm, T=100)
    posterior = fit(cssm, data)
    @test hasproperty(posterior, :parameters)  # hasfield expects a type, not an instance
end

37.5.3 Probabilistic Tests

For probabilistic code, test distributions:

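using Statistics, Test

@testset "Observation Model" begin
    model = PoissonObservation(x -> 10.0)  # constant intensity λ(x) = 10
    x = 1.0                                # placeholder latent state
    samples = [observe(model, x) for _ in 1:10_000]
    @test mean(samples) ≈ 10.0 atol=0.5    # Poisson mean equals λ
    @test var(samples) ≈ 10.0 atol=1.0     # Poisson variance equals λ
end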
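
Tolerances in a probabilistic test can be tied to the sampling error instead of being chosen by hand, and a seeded generator keeps the test reproducible. A minimal sketch using only the standard library, assuming a hypothetical Gaussian observation function `gaussian_observe`:

```julia
using Random, Statistics, Test

# Hypothetical observation model: y = x + σ * ε with ε ~ N(0, 1).
gaussian_observe(rng, x, σ) = x + σ * randn(rng)

rng = MersenneTwister(2024)  # fixed seed, so the test is reproducible
n, x, σ = 10_000, 10.0, 2.0
samples = [gaussian_observe(rng, x, σ) for _ in 1:n]

# The standard error of the sample mean is σ / sqrt(n); allow 5 standard errors.
se_mean = σ / sqrt(n)

@testset "Gaussian observation moments" begin
    @test isapprox(mean(samples), x; atol = 5 * se_mean)
    @test isapprox(std(samples), σ; rtol = 0.05)
end
```

A tolerance of five standard errors makes a spurious failure very unlikely while still catching a misplaced mean or a wrong noise scale.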

37.5.4 Causal Tests

Test causal methods on synthetic data with known truth:

@testset "Intervention" begin
    cssm = create_synthetic_cssm()
    data_obs = simulate(cssm, policy=natural_policy)
    effect_true = true_intervention_effect(cssm)
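    # `do` is a reserved word in Julia, so use the DoIntervention type from Section 37.7.2
    effect_est = estimate_intervention(data_obs, DoIntervention(:A, 1))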
    @test abs(effect_est - effect_true) < 0.1
end
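
One way to obtain a known ground truth is to simulate from a small structural model with a chosen effect size. The sketch below is a static linear example with a single confounder, not a full CDM. All variable names and coefficients are assumptions made for illustration.

```julia
using LinearAlgebra, Random, Statistics

rng = MersenneTwister(1)
n = 10_000
true_effect = 2.0

# Hypothetical linear system: U confounds treatment A and outcome Y.
U = randn(rng, n)
A = U .+ randn(rng, n)
Y = true_effect .* A .+ 3.0 .* U .+ randn(rng, n)

# Naive estimate: slope of Y on A, ignoring U (biased by the back-door path).
naive = cov(A, Y) / var(A)

# Adjusted estimate: least squares of Y on A and U (back-door adjustment).
design = hcat(A, U, ones(n))
adjusted = (design \ Y)[1]
```

A causal test would then assert that `adjusted` is close to `true_effect`. Asserting that `naive` is not close is a useful extra check that the synthetic data really contain confounding.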

37.6 Performance Practices

37.6.1 Use Type Stability

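Ensure that a function's return type depends only on the types of its arguments, not on their values:

using StaticArrays

# Good: the return type is always SVector{2, Float64}, with no heap allocation
function step(model::LotkaVolterra, x, a, dt)
    return SVector{2, Float64}(f1(x, a), f2(x, a))
end

# Worse: the element type follows whatever f1 and f2 return,
# and every call allocates a new Vector
function step(model, x, a, dt)
    return [f1(x, a), f2(x, a)]
end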
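
A function is type unstable when its return type depends on argument values rather than argument types. The sketch below uses two hypothetical helper functions, `clip_unstable` and `clip_stable`, and checks them with `Test.@inferred` and `Base.return_types`.

```julia
using Test

# Type unstable: for a Float64 input this returns either a Float64 (x) or an Int (0).
clip_unstable(x) = x > 0 ? x : 0

# Type stable: zero(x) has the same type as x.
clip_stable(x) = x > 0 ? x : zero(x)

# Inferred return types for a Float64 argument.
unstable_type = only(Base.return_types(clip_unstable, (Float64,)))
stable_type = only(Base.return_types(clip_stable, (Float64,)))

@testset "type stability" begin
    @test @inferred(clip_stable(1.0)) == 1.0
    # @inferred throws when the actual and inferred return types differ.
    @test_throws ErrorException @inferred clip_unstable(1.0)
    @test isconcretetype(stable_type)
    @test !isconcretetype(unstable_type)
end
```

Running `@code_warntype clip_unstable(1.0)` in the REPL highlights the same `Union` return type.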

37.6.2 Pre-allocate Arrays

Avoid allocations in hot loops:

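# Good: pre-allocate the trajectory once, outside the loop
function simulate(cssm, x0, T)
    d = length(x0)                    # state dimension
    X = Matrix{Float64}(undef, d, T)
    X[:, 1] = x0                      # initial state (x0 is an assumed extra argument)
    for t in 2:T                      # start at 2 so that column t-1 exists
        X[:, t] = step(cssm, view(X, :, t-1), ...)
    end
    return X
end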
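
A fully runnable variant of the same idea is sketched below. It assumes a hypothetical in-place update, `logistic_step!`, standing in for the model's step function, and uses views so that no column is copied inside the loop.

```julia
# Hypothetical in-place update: writes the next state into `out`.
# Explicit Euler step of logistic growth, applied to each component.
function logistic_step!(out, x, r, K, dt)
    @. out = x + dt * r * x * (1 - x / K)
    return out
end

function simulate_logistic(x0, r, K, dt, T)
    X = Matrix{Float64}(undef, length(x0), T)  # allocated once
    X[:, 1] .= x0
    for t in 2:T
        # Views avoid copying columns; the step writes straight into X.
        logistic_step!(view(X, :, t), view(X, :, t - 1), r, K, dt)
    end
    return X
end

X = simulate_logistic([50.0, 10.0], 1.0, 100.0, 0.1, 200)
```

The only array allocated per call is the trajectory matrix itself. Both components approach the carrying capacity `K = 100` by the final column.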

37.6.3 Use Specialised Packages

Leverage specialised packages for performance:

  • DifferentialEquations.jl: ODE/SDE solving
  • StateSpaceModels.jl: State-space inference
  • ForwardDiff.jl: Automatic differentiation

37.7 Reusable Components

37.7.1 Observation Models

Create reusable observation model types:

using Distributions

abstract type AbstractObservationModel end

# Parametrising on F keeps the field concretely typed (a `λ::Function` field is abstract)
struct PoissonObservation{F} <: AbstractObservationModel
    λ::F  # Intensity function
end

function observe(model::PoissonObservation, x)
    return rand(Poisson(model.λ(x)))
end

37.7.2 Intervention Operators

Create reusable intervention operators:

struct DoIntervention{T}
    variable::Symbol
    value::T
end

function apply(cssm::CDM, intervention::DoIntervention)
    # Modify CDM according to intervention
    return modified_cssm
end
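
The body of `apply` depends on how the model represents its state. As one possible sketch, assume (this convention is not from the text) that the state is a `NamedTuple` and that `process_model(x, θ)` returns the next state. The intervention then wraps the process model so that the chosen variable is held at its fixed value. `toy_process` and `set_var` are hypothetical names.

```julia
abstract type AbstractCDM end

struct CDM{F, H, P, O} <: AbstractCDM
    process_model::F
    observation_model::H
    parameters::P
    exogenous::O
end

struct DoIntervention{T}
    variable::Symbol
    value::T
end

# Assumed convention: the state is a NamedTuple and
# process_model(x, θ) returns the next state as a NamedTuple.
function apply(cdm::CDM, intervention::DoIntervention)
    # Overwrite the intervened variable with its fixed value.
    set_var(x) = merge(x, NamedTuple{(intervention.variable,)}((intervention.value,)))
    # Fix the variable before and after each step, cutting its usual dynamics.
    intervened = (x, θ) -> set_var(cdm.process_model(set_var(x), θ))
    return CDM(intervened, cdm.observation_model, cdm.parameters, cdm.exogenous)
end

# Toy two-variable process, for illustration only.
toy_process(x, θ) = (prey = x.prey + θ.r * x.prey - θ.c * x.pred, pred = x.pred + 1.0)

cdm = CDM(toy_process, identity, (r = 0.5, c = 0.1), nothing)
cdm_do = apply(cdm, DoIntervention(:pred, 0.0))

x0 = (prey = 1.0, pred = 3.0)
x1_obs = cdm.process_model(x0, cdm.parameters)        # natural dynamics
x1_do = cdm_do.process_model(x0, cdm_do.parameters)   # pred held at 0.0
```

Returning a new `CDM` rather than mutating the original keeps the observational model available for comparison.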

37.7.3 Diagnostics

Create reusable diagnostic functions:

function posterior_predictive_check(model, data, n_samples=1000)
    y_rep = [simulate(model) for _ in 1:n_samples]
    return compare(y_rep, data)
end
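
`simulate` and `compare` are left abstract in the function above. One concrete choice, sketched here under the assumption that the check uses a scalar test statistic, is a posterior predictive p-value. The names `ppc_pvalue`, `well_specified`, and `too_narrow` are hypothetical, and the "models" are simple stand-ins rather than fitted CDMs.

```julia
using Random, Statistics

# `simulate_rep(rng)` is assumed to draw one replicated data set from the fitted model.
function ppc_pvalue(rng, simulate_rep, data, statistic; n_samples = 1000)
    t_obs = statistic(data)
    t_rep = [statistic(simulate_rep(rng)) for _ in 1:n_samples]
    # Fraction of replicates whose statistic is at least as large as the observed one.
    return mean(t_rep .>= t_obs)
end

rng = MersenneTwister(7)
data = randn(rng, 200)                    # stand-in for observed data
well_specified(rng) = randn(rng, 200)     # same distribution as the data
too_narrow(rng) = 0.5 .* randn(rng, 200)  # underestimates the spread

p_good = ppc_pvalue(rng, well_specified, data, std)
p_bad = ppc_pvalue(rng, too_narrow, data, std)
```

A p-value near 0 or 1, as expected for `too_narrow`, signals that the model fails to reproduce that feature of the data.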

37.8 Documentation

37.8.1 Docstrings

Use Julia's docstring syntax:

"""
    CDM(process_model, observation_model, parameters, exogenous)

Create a Causal Dynamical Model.

# Arguments

- `process_model`: Process dynamics function
- `observation_model`: Observation model
- `parameters`: Model parameters
- `exogenous`: Exogenous noise structure

# Examples

```julia
cssm = CDM(lotka_volterra, poisson_obs, θ, noise)
```
"""
struct CDM … end


Version Control and Reproducibility

Project.toml

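A [compat] entry bounds the allowed versions of a dependency rather than pinning one. For exact reproducibility, also commit Manifest.toml, which records the resolved versions. Version bounds in Project.toml look like this:
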
[compat]
julia = "1.12"
DifferentialEquations = "7.8"
StateSpaceModels = "0.6"

37.8.2 Documentation

Document:

  • Dependencies: What packages are needed?
  • Setup: How to install and set up?
  • Usage: How to use the package?
  • Examples: Working examples

37.9 Key Takeaways

  1. Modular package structure separates concerns and enables composability
  2. Type system design enables performance and extensibility
  3. Comprehensive testing (unit, integration, probabilistic, causal) builds confidence in correctness
  4. Performance practices (type stability, pre-allocation, specialised packages) enable efficiency
  5. Reusable components reduce code duplication and improve maintainability
  6. Good documentation and version control ensure reproducibility

37.10 Further Reading

  • Julia documentation: Performance tips, package development
  • Julia package development guide
  • Best practices for scientific computing