Simulation Backends
Simulation Backends
compartmental ships four simulation backends for the same IR model: Gillespie (exact SSA), tau-leap, chain-binomial, and ODE (RK4). They all take the same compiled model and parameter vector and return the same Trajectory type. The choice is a tradeoff between fidelity, speed, and what the downstream analysis requires.
camdl simulate model.ir.json --params base.toml --backend gillespie
camdl simulate model.ir.json --params base.toml --backend tau_leap --dt 1
camdl simulate model.ir.json --params base.toml --backend chain_binomial --dt 1
camdl simulate model.ir.json --params base.toml --backend ode --dt 0.1
Gillespie (exact SSA)
When to use: small-to-medium populations where stochastic extinction, outbreak probability, or the exact distribution of event times matters. The gold standard for correctness.
Algorithm
The Gillespie Stochastic Simulation Algorithm (Gillespie 1977) is exact for continuous-time Markov chains. Given state X and propensities λ₁, …, λ_M:
- Compute the total propensity: Λ = Σᵢ λᵢ
- Draw the waiting time to the next event: τ ~ Exp(Λ), i.e. τ = −ln(U₁)/Λ where U₁ ~ Uniform(0,1)
- Select which event fires: choose transition j with probability λⱼ/Λ
- Apply the stoichiometry: X ← X + νⱼ
- Update propensities and goto 1
This is exact in the sense that the joint distribution of (τ, j) is exactly the correct first-passage distribution of the CTMC.
Implementation details (gillespie.rs)
Incremental propensity updates. After transition j fires and changes compartments according to νⱼ, only transitions whose rate expressions depend on those compartments need recomputation. The comp_to_transitions map (built at model compile time) gives the set of affected transitions for each compartment. This makes each event O(k) where k is the mean number of transitions sharing a compartment, rather than O(M).
Full recompute every 10 000 events. Incremental floating-point additions to lambda_total accumulate roundoff error. A periodic full recompute (sum of all propensities from scratch) prevents drift. The interval is tunable via the FULL_RECOMPUTE_INTERVAL constant.
Time-dependent transitions. Transitions whose rate expression contains a TimeFunc (seasonal forcing, piecewise schedules) cannot be maintained by the compartment-change dependency graph alone. They are tracked separately in time_dep_transitions and re-evaluated whenever simulation time advances to a boundary (output time, intervention time), even if no integer state changed.
Absorbing states. When Λ = 0 (no events possible), the simulation fast- forwards to the next intervention or output time. After applying an intervention, propensities are recomputed — the system may become active again (e.g., a vaccination campaign seeds a new outbreak).
Real compartments (PDMP). If the model contains real compartments with ODE equations, Gillespie operates as a Piecewise-Deterministic Markov Process (PDMP). Between discrete events, the real state advances continuously by RK4. This is an approximation for v0.1 — a correct PDMP implementation would use thinning to account for propensity changes driven by the real state during the interevent interval. The v0.1 implementation advances real state in one RK4 step at each event, which is accurate when real-state dynamics are slow relative to the event rate.
Complexity and scaling
- Time per event: O(k) amortized (incremental update) where k ≪ M for sparse dependency graphs (typical epidemic models).
- Total events: proportional to Λ × T. For a 10 000-person SIR at peak, Λ ≈ 3000 events/day; a 100-day simulation ≈ 300 000 events per run.
- Scales poorly with N: events ∝ N, so 1M-person runs are often impractical. Use tau-leap or ODE for large populations.
Tau-leap
When to use: large populations where individual-event resolution is unnecessary. ~10–100× faster than Gillespie; introduces Poisson approximation error that shrinks as dt → 0.
Algorithm
Gillespie’s tau-leap approximation (Gillespie 2001): in a step of length τ, assume propensities are approximately constant. Then the number of times transition j fires is:
Δnⱼ ~ Poisson(λⱼ · τ)
State update:
X(t + τ) = X(t) + Σⱼ νⱼ · Δnⱼ
Negative compartments (which Poisson draws can produce) are clamped to zero.
Implementation details (tau_leap.rs)
Step truncation. The nominal dt is truncated when an output time or intervention time falls within the current step. The simulation always hits boundaries exactly.
Ordering. All transition draws are made from the state at the start of the step, then applied simultaneously. This is the basic (non-adaptive) tau-leap scheme. Adaptive schemes (Cao et al. 2006, Xu & Cai 2011) that automatically choose τ are not implemented; the user supplies --dt.
Clamping. After applying all stoichiometry changes, any negative counts are zeroed. A warning is logged. Frequent clamping indicates dt is too large.
Real compartments. Advanced with RK4 using the integer state at the end of the tau-leap step (post-clamping). This slight ordering asymmetry (integers first, real second) is a minor approximation.
Error analysis
The tau-leap approximation error is O(τ). The dominant term is the variance introduced by using constant-propensity Poisson draws over the interval; the exact distribution would use time-varying propensities. In practice, tau-leap is accurate when τ is small enough that no single compartment changes by more than ~1% per step.
A rule of thumb: τ ≤ 1/(10 · max(λⱼ/n_source)), where n_source is the population in the source compartment. For --dt 1 'days with β≈0.3, this is satisfied when N > ~100.
Chain-binomial
When to use: discrete-time models where the generation interval is the natural time step (e.g., daily surveillance data, weekly incidence). Respects integer constraints better than tau-leap; fewer clamping issues.
Algorithm
Reed-Frost chain-binomial model (Abbey 1952): in a discrete time step Δt, each susceptible escapes infection independently with probability exp(−λ · Δt) (survival probability under a constant-rate Poisson process). Therefore the number who become infected is:
Δn_infection ~ Binomial(S, 1 − exp(−λ · Δt))
More generally, for any transition with rate λ and source population n_src:
p = 1 − exp(−λ · Δt)
Δnⱼ ~ Binomial(n_src, p)
This is exact if λ is truly constant over the interval and events are independent (no competition between transitions for the same source).
Implementation details (chain_binomial.rs)
Multinomial competing risks. When multiple transitions draw from the same source compartment (e.g., infection and death both depleting S), the chain-binomial uses a multinomial draw — not independent binomials. This matches pomp’s reulermultinom semantics: the total number leaving a compartment is bounded by the compartment size, and the split between competing outflows is proportional to their rates.
Transitions are precomputed into source groups at model compilation time (CompiledModel::source_groups). For a source compartment with k competing outflows at per-capita probabilities p₁, …, p_k:
- Compute per-capita rate for each outflow:
r_i = propensity_i / n_src - Convert to probability:
p_i = 1 − exp(−r_i · Δt) - Draw sequentially (conditional binomial decomposition):
count_1 ~ Binom(n_remaining, p_1 / (1 − 0))count_2 ~ Binom(n_remaining − count_1, p_2 / (1 − p_1))count_3 ~ Binom(n_remaining − count_1 − count_2, p_3 / (1 − p_1 − p_2))
- Guarantee:
Σ count_i ≤ n_src(no overdraw)
This sequential decomposition is exact for the multinomial distribution. For source groups of size 1 (single outflow), it reduces to a standard binomial.
Poisson approximation for draws. Individual draws use Poisson(n · p) capped at n, which approximates Binomial(n, p) for large n. This is adequate for epidemic models where compartments typically hold hundreds to millions of individuals.
Inflows. Transitions with no source compartment (births, importation) are not part of any source group. They draw from the total propensity directly via Poisson(rate · dt).
Per-capita rate conversion. The IR stores total propensities (e.g., mu × S). The chain-binomial divides by n_src to recover the per-capita rate before converting to probability. This is critical: using the total propensity directly would give p = 1 − exp(−mu·S·dt) ≈ 1.0 for large compartments, killing the entire population in one step.
Overdispersion. When a transition has overdispersed(rate, σ²), the Gamma multiplier is applied to the per-capita rate before probability conversion: effective_rate = per_capita × G where G ~ Gamma(dt/σ², σ²/dt).
Real compartments. Advanced with RK4 before the multinomial draws (using the start-of-step integer state). The ordering difference from tau-leap is intentional: for chain-binomial, the continuous dynamics represent processes that run in parallel with (rather than after) the discrete transitions.
Relationship to tau-leap
Chain-binomial and tau-leap agree in the limit of large populations and small p: Binomial(n, p) ≈ Poisson(n·p). The key differences:
Multinomial vs independent. Chain-binomial draws competing transitions from a shared source as a multinomial (bounded). Tau-leap draws them independently as Poisson (can overdraw, requires clamping).
No clamping needed. The multinomial guarantees
Σ count_i ≤ n_srcby construction. Tau-leap needs post-step clamping to zero.Matches Euler-multinomial. The chain-binomial is equivalent to pomp’s
reulermultinomwhen using the same per-capita probabilities — making it the appropriate backend for validating against pomp implementations.
ODE (RK4)
When to use: large populations where stochasticity is negligible, or for fast deterministic exploration of parameter space before running stochastic ensembles. Agrees with Gillespie/tau-leap in expectation (same rate expressions drive both).
Algorithm
Fourth-order Runge-Kutta integration of the system:
dXᵢ/dt = Σⱼ νᵢⱼ · λⱼ(X, θ, t)
where νᵢⱼ is the stoichiometry (±1 or 0) of compartment i in transition j, and λⱼ is the transition rate evaluated at state X, parameters θ, and time t.
For each RK4 step from t to t + h:
k₁ = f(X(t), t)
k₂ = f(X + h/2·k₁, t + h/2)
k₃ = f(X + h/2·k₂, t + h/2)
k₄ = f(X + h·k₃, t + h)
X(t + h) = X(t) + h/6 · (k₁ + 2k₂ + 2k₃ + k₄)
Global truncation error is O(h⁴).
Implementation details (ode.rs)
Unified float state. Integer compartments are promoted to f64 at the start of the ODE run and remain continuous throughout. The ODE system has dimension = n_int + n_real (all compartments).
Derivative sources.
- From transitions (auto-derived): for each transition j, the rate λⱼ is evaluated at the current state; the contribution to compartment i’s derivative is
νᵢⱼ · λⱼ. - From explicit ODE equations: compartments declared
realhave their derivative specified directly in theode {}block and evaluated verbatim.
Propensity evaluation during RK4 substeps. At each k₁–k₄ evaluation, integer-compartment floats are rounded to i64 to construct the IntState needed by eval_expr. This introduces a rounding error of O(1) in the compartment value at each substep evaluation, which means propensities have relative error O(1/N). For populations N > 100 this is negligible. For very small compartment values (< ~10) the rounding can cause premature extinction — this is the deterministic approximation’s inherent limitation.
Non-negativity. After each RK4 step, all state values are clamped to ≥ 0 to suppress floating-point undershoots. This is a conservative guard; a properly tuned h should not require it in practice.
Interventions. At each intervention time, the float state is rounded into IntState/RealState, the intervention action is applied (which may fractionally transfer individuals), and the result is converted back to float.
Flows. Cumulative flows are accumulated as f64 (rate × dt per step) and rounded to u64 at each output snapshot, matching the integer type used by stochastic backends.
Seed parameter. The --seed CLI argument is accepted but ignored; ODE runs are fully deterministic.
When ODEs and SSA disagree
The ODE is the mean-field limit (N → ∞) of the CTMC. They disagree in three important ways:
Extinction. The ODE never reaches zero — it asymptotically approaches zero. The CTMC can and does hit the absorbing state I=0. If outbreak probability or time-to-extinction is the question, use Gillespie.
Stochastic amplification. Near critical points (R₀ ≈ 1), random fluctuations can drive large outbreaks that the ODE’s deterministic trajectory misses. The ODE underestimates outbreak probability in this regime.
Jensen’s inequality effects. Nonlinear rate expressions (e.g., βSI/N) evaluated at the mean state differ from the mean of the expression evaluated at random states. For compartmental models this is typically a small correction, but it can matter for highly variable forcing functions.
Comparison table
| Property | Gillespie | Tau-leap | Chain-binomial | ODE |
|---|---|---|---|---|
| Time type | continuous | discrete | discrete | continuous |
| Stochastic | yes | yes | yes | no |
| Exact | yes | approx | approx | approx |
| Extinction behavior | correct | clamped | correct | never |
| Step size | event-driven | user (–dt) | user (–dt) | user (–dt) |
| Speed (large N) | slow | fast | fast | fast |
| Speed (small N) | fast | overhead | overhead | fast |
| Real compartments | PDMP (approx) | hybrid | hybrid | native |
| CRN coupling | yes | yes | yes | n/a |
| Overdispersion | incompatible | supported | supported | incompatible |
| Int rounding during RK4 | n/a | n/a | n/a | yes (O(1/N) error) |
Rule of thumb: use Gillespie for N < 10 000 and when extinction matters; tau-leap or chain-binomial for N > 10 000 in production stochastic runs (chain-binomial if the generation interval aligns with your Δt); ODE for fast parameter sweeps or very large spatial models. If the model uses overdispersed() transitions, Gillespie and ODE are rejected at runtime — use tau-leap or chain-binomial.
Extra-demographic stochasticity (overdispersion)
Transitions wrapped in overdispersed(rate, σ²) receive Gamma-distributed multiplicative noise on their rate (He et al. 2010). The Poisson-Gamma compound produces NegBinomial event counts with mean = rate × Δt and variance inflated by σ²_SE.
Tau-leap implementation. For each overdispersed transition per step:
- Evaluate propensity λ and overdispersion σ² from the current state
- Draw a Gamma-distributed rate: ε ~ Gamma(shape = λΔt/σ², scale = σ²)
- Draw events: ΔN ~ Poisson(ε)
This is equivalent to ΔN ~ NegBinomial(mean = λΔt, size = λΔt/σ²). When σ² → 0, the Gamma concentrates at its mean and the draw converges to Poisson(λΔt).
Chain-binomial implementation. Same Gamma-Poisson compound applied to the expected count n·p (where p = 1 − exp(−λΔt)), capped at the source population.
Backend capability system. Each model declares required capabilities (derived from the IR at compile time). Each backend declares what it supports. Mismatch produces a hard error before simulation starts — no silent wrong answers.
$ camdl-sim model.ir.json --backend gillespie --seed 42
error: model requires capabilities not supported by backend 'gillespie':
- OVERDISPERSION: transitions with overdispersion require --backend tau_leap or chain_binomial