Dimensions and Stratification

An SIR model treats everyone as identical: one pool of susceptibles, one pool of infecteds, one recovery rate. Real populations are structured. Children and adults mix differently. Cities seed epidemics into rural areas. Vaccination history determines who is still at risk. The epidemiological questions almost always live in the structure — which age group drives transmission, which patch re-seeds after a campaign, whether partial immunity matters for dynamics.

camdl’s answer to this is dimensions: named axes of population heterogeneity that the compiler uses to expand a compact model specification into the full set of stratified compartments, transitions, and bookkeeping. You write the model once, declare which axes matter, and the compiler generates the expanded system — checked for dimensional consistency at every step.

This chapter introduces the abstractions, shows how they connect to external data, and works through a series of realistic examples drawn from measles, polio, and STI modeling.

The core abstraction

A dimension is a finite set of named levels:

dimensions {
  age   = [child, adult]
  patch = [north, south, east, west, center]
}

A stratification applies a dimension to compartments:

stratify(by = age)
stratify(by = patch)

By default, stratify(by = age) applies to all compartments — S, I, and R each gain an age index. This is the right default because in most models, every disease state needs to track every dimension of population heterogeneity (you need to know how many susceptible children there are, not just how many susceptibles).

When a dimension only applies to some compartments, use partial stratification with the only clause:

stratify(by = immunity, only = [R, V])

Here S, E, and I keep their existing dimensions, but R and V gain an extra immunity axis. The compiler then requires explicit routing into partially stratified compartments: you must write I --> R[natural], not just I --> R. This is covered in detail in Partial stratification below.

Each stratify() call adds one dimension. You cannot write stratify(by = [age, patch]) — it’s a syntax error. This is deliberate: each dimension is added in a separate statement, and the order of stratify() calls determines the positional index order. After stratify(by = age) then stratify(by = patch), the first index is age and the second is patch: S[child, north].

After both stratifications, S has shape age × patch — ten concrete compartments for a two-age, five-patch model. The key rule:

Indexing works like marginalization. Think of a stratified compartment as a joint table over its dimensions. Bare S — no indices — marginalizes over everything: the total count across all strata. S[child] fixes age and marginalizes over patches: total susceptible children. S[child, north] is fully specified — no dimensions left to sum over. Omitting a dimension always means “sum over it,” never “use the current one.” The compiler never guesses which stratum you meant.

This is the no-auto-localization principle, and it prevents a large class of silent bugs in stratified models. In frameworks where bare S implicitly means “the current stratum’s S,” a misplaced variable quietly produces wrong rates. In camdl, you must be explicit — and the compiler checks that every index has the right type.
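The marginalization rule can be sketched outside camdl. The following Python is an illustration only: the dictionary layout, the counts, and the marginal() helper are assumptions made for this example, not part of camdl or its compiler.

```python
# Illustration only: a stratified compartment as a joint table keyed by
# (age, patch). All names and counts here are hypothetical.
AGE = ["child", "adult"]
PATCH = ["north", "south", "east", "west", "center"]

# Hypothetical counts for S[age, patch].
S = {
    (a, p): 100 * (i + 1) + j
    for i, a in enumerate(AGE)
    for j, p in enumerate(PATCH)
}


def marginal(table, age=None, patch=None):
    """Sum over every dimension that is left unspecified (None)."""
    return sum(
        count
        for (a, p), count in table.items()
        if (age is None or a == age) and (patch is None or p == patch)
    )


total_S = marginal(S)                                    # bare S
child_S = marginal(S, age="child")                       # S[child]
child_north_S = marginal(S, age="child", patch="north")  # S[child, north]
```

Here marginal(S) plays the role of bare S, and leaving patch unspecified sums over all five patches. There is no call that means "the current stratum", which mirrors the rule that omitted dimensions are always summed.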

How other frameworks handle this

In a stratified model, every compartment reference is ambiguous: does S mean the current stratum’s susceptibles, the global total, or a specific stratum? Most frameworks resolve this through implicit context or untyped integer indexing. When the modeler’s intent doesn’t match the resolution rule, the model runs without error and produces quantitatively wrong output.

In pomp¹, multi-patch models require manually expanding compartments to S1, S2, S3 and writing C snippets that reference each by name — the modeler manages all indexing by hand, and a typo between two valid expanded names (S2 for S3) silently produces the wrong rate.² The spatPomp extension³ adds implicit unit-scoping for measurement components (bare C resolves to the current unit’s value), but this doesn’t extend to the process model, which still requires explicit S[u] indexing — two different resolution regimes in the same framework.

In odin⁴, stratified compartments are integer-indexed arrays. Bare S without an index is caught by the parser (error E2022⁵), so the silent-substitution failure mode doesn’t apply. But all indices are untyped integers: S[age_idx] and S[patch_idx] are the same operation to the language. Swapping them parses cleanly and only gets caught if the dimensions happen to differ in size (a bounds error at runtime, not a type error at compile time).⁶ Similarly, sum(S) vs. sum(S[i, ]) — global total vs. row-specific sum — is a semantic distinction the modeler manages by hand; the compiler can check rank but not intent.

The common thread: these frameworks catch syntactic errors (missing brackets, wrong rank) but not semantic errors (wrong dimension, swapped indices, incorrect aggregation level). camdl’s approach — bare names are always global sums, indices carry named dimension types, and the compiler rejects mismatches — catches both.

Rates are total propensity — the compiler checks

A related design choice: in camdl, the @ rate on every transition is the total propensity — the absolute event rate in units of population per time (P·T⁻¹). The compiler checks this dimensionally. There is no hidden per-capita multiplication.

If you write a per-capita rate where a total propensity is needed, the compiler catches it. The correct form and the common mistake:

# ✓ Correct: beta:T⁻¹ × S:P × I:P / N:P = P·T⁻¹
infection : S --> I @ beta * S * I / N

# ✗ Wrong: beta:T⁻¹ × I:P / N:P = T⁻¹ (missing the × S)
infection : S --> I @ beta * I / N

The second form is a per-capita rate (dimension T⁻¹) rather than a total propensity (dimension P·T⁻¹). The compiler’s dimensional checker infers types from parameter declarations (rate → T⁻¹, count → P, probability → dimensionless), propagates them through arithmetic, and rejects the mismatch at compile time with error E300.

This is one of the most common bugs in compartmental models. In most other frameworks, it compiles without complaint and produces trajectories where infection happens at the wrong rate. The checker also verifies that arguments to exp() and log() are dimensionless, that additions don’t mix dimensions, and that table unit annotations are consistent.
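The dimensional bookkeeping behind this check is simple enough to sketch. The Python below is a hedged illustration, not camdl's checker: it represents each dimension as a pair of exponents (population, time), and the helper names mul, div, and require_propensity are invented for this example.

```python
# Illustrative sketch of dimensional checking; not camdl's implementation.
# A dimension is a pair of exponents: (population, time).
COUNT = (1, 0)        # P
RATE = (0, -1)        # T^-1
PROPENSITY = (1, -1)  # P * T^-1


def mul(x, y):
    """Multiplying quantities adds their exponents."""
    return (x[0] + y[0], x[1] + y[1])


def div(x, y):
    """Dividing quantities subtracts their exponents."""
    return (x[0] - y[0], x[1] - y[1])


def require_propensity(dim):
    """Reject any rate expression that is not population per time."""
    if dim != PROPENSITY:
        raise TypeError(f"expected {PROPENSITY}, got {dim}")
    return dim


# beta * S * I / N
correct = div(mul(mul(RATE, COUNT), COUNT), COUNT)
# beta * I / N (the S factor is missing)
per_capita = div(mul(RATE, COUNT), COUNT)

require_propensity(correct)  # passes silently
try:
    require_propensity(per_capita)
    rejected = False
except TypeError:
    rejected = True
```

The first expression reduces to (1, -1) and passes; the second reduces to (0, -1), a per-capita rate, and is rejected. Addition would additionally require both operands to have identical exponents, which this sketch leaves out.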

Here is what the compiler produces when it catches an undeclared name — a typo like Q where you meant I:

$ camdl check error_test.camdl
error[E100]: undeclared name 'Q'

  ┌─ /Users/vsb/projects/work/camdl-book/guide/dimensions/error_test.camdl:18:34
  │
 18│    recovery  : I --> R  @ gamma * Q
  │                                   ^
  │
  = hint: check spelling, or add a declaration in compartments/parameters/let/tables

The error points to the exact file, line, and column, with a caret under the offending token and a hint suggesting what to check.

Tables: dimensionally-typed data

The bridge between dimensions and data is the table. A table declaration carries a type signature that says which dimensions index it:

tables {
  C_age : age × age      = [[12.0, 4.0], [4.0, 8.0]]
  pop   : patch           = read("data/pop.tsv")
  W     : patch × patch   = read("data/gravity.tsv", default = 0.0)
}

The annotation : age × age is not documentation — it is a type the compiler enforces. C_age[a, b] requires both a and b to be bound to the age dimension. Writing C_age[a, p] where p is a patch variable is a compile-time error. This catches a category of bug that in array-indexed frameworks (pomp, odin) only surfaces as wrong numbers at runtime.
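To make the idea of a dimension-typed lookup concrete, here is a small Python sketch. The TypedTable class is hypothetical and exists only for this illustration; it also checks at run time, whereas camdl is described above as rejecting the mismatch at compile time.

```python
class TypedTable:
    """A table whose index positions carry dimension names (hypothetical)."""

    def __init__(self, name, dims, values):
        self.name = name
        self.dims = tuple(dims)
        self.values = values

    def lookup(self, *indices):
        # Each index is a (dimension_name, level) pair.
        given = tuple(dim for dim, _ in indices)
        if given != self.dims:
            raise TypeError(
                f"{self.name} is indexed by {self.dims}, got {given}"
            )
        return self.values[tuple(level for _, level in indices)]


# Same values as the inline C_age table above.
C_age = TypedTable(
    "C_age",
    dims=["age", "age"],
    values={
        ("child", "child"): 12.0, ("child", "adult"): 4.0,
        ("adult", "child"): 4.0, ("adult", "adult"): 8.0,
    },
)

ok = C_age.lookup(("age", "child"), ("age", "adult"))

try:
    C_age.lookup(("age", "child"), ("patch", "north"))  # wrong dimension
    mismatch_caught = False
except TypeError:
    mismatch_caught = True
```

With plain integer indices, both lookups would be indistinguishable; tagging each index with its dimension is what makes the second one detectable.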

Inline vs. file-based tables

Small tables — contact matrices, duration vectors — are written inline. Large tables — spatial kernels, patch populations, district-level demographics — are loaded from TSV/CSV files:

dimensions/seir_spatial_data.camdl (excerpt)

tables {
  pop, init_sus : patch          = read("data/demographics.tsv")
  adj           : patch × patch  = read("data/adjacency.tsv", default = 0.0)
}

File paths are relative to the model file

Paths in read() resolve relative to the .camdl file’s own directory, not the shell’s working directory. Running camdl compile models/mymodel.camdl from the repo root resolves read("data/contact.tsv") to models/data/contact.tsv. This is the same convention as CSS url() and Python paths built from __file__: the file’s location is the anchor, so models work regardless of where you invoke the compiler from. (R’s here::here() solves a similar problem but anchors to the project root rather than to the individual file.)
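The resolution rule amounts to joining the relative path onto the model file's parent directory. A minimal Python sketch, where resolve_read_path is a hypothetical helper name and not a camdl function:

```python
from pathlib import PurePosixPath


def resolve_read_path(model_file, relative_path):
    """Anchor a read() path at the model file's directory (sketch)."""
    return PurePosixPath(model_file).parent / relative_path


resolved = resolve_read_path("models/mymodel.camdl", "data/contact.tsv")
```

The working directory never enters the calculation, which is why the result is the same wherever the compiler is invoked.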

The file is long-format: index columns followed by value columns. The number of index columns is determined by the type signature — one column per dimension in the ×-separated type. Here is what the files for the declarations above look like:

dimensions/data/demographics.tsv

patch   pop     init_sus
north   50000   0.90
south   30000   0.85
east    20000   0.80

The type signature pop, init_sus : patch tells the compiler: one index column (patch), two value columns (pop and init_sus). Multiple value columns loaded from the same file create separate tables sharing the same index.

For a two-dimensional table like the adjacency matrix:

dimensions/data/adjacency.tsv

patch   patch   weight
north   south   0.10
north   east    0.05
south   north   0.10
south   east    0.08
east    north   0.05
east    south   0.08

The type signature adj : patch × patch means two index columns (both patch), one value column. Each row is a (source, destination, weight) triple. The compiler validates that every index value is a declared level of its dimension — a typo like nroth instead of north is a compile error with a Levenshtein-distance suggestion.

default = 0.0 marks a sparse table — index combinations missing from the file get the default value rather than raising an error. Without default, the compiler checks that every combination is present (a dense check). For a 774-patch spatial model, a dense gravity matrix would need 774² ≈ 600,000 rows; default = 0.0 lets you list only the nonzero entries.
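The loading rules described in this section (index columns first, level validation, dense check, sparse fill) can be sketched in Python. This is an assumption-laden illustration rather than camdl's loader: load_table is a hypothetical name, and the typo suggestion uses difflib's similarity ratio from the standard library instead of Levenshtein distance.

```python
import csv
import difflib
import io
from itertools import product

PATCH = ["north", "south", "east"]

# Same rows as the adjacency file above, tab-separated.
ADJACENCY_TSV = (
    "patch\tpatch\tweight\n"
    "north\tsouth\t0.10\n"
    "north\teast\t0.05\n"
    "south\tnorth\t0.10\n"
    "south\teast\t0.08\n"
    "east\tnorth\t0.05\n"
    "east\tsouth\t0.08\n"
)


def load_table(text, dims, default=None):
    """Load a long-format table; dims has one list of levels per index column."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    table = {}
    for row in reader:
        key = tuple(row[: len(dims)])
        for level, levels in zip(key, dims):
            if level not in levels:
                close = difflib.get_close_matches(level, levels, n=1)
                hint = f"; did you mean {close[0]!r}?" if close else ""
                raise ValueError(f"unknown level {level!r}{hint}")
        table[key] = float(row[len(dims)])
    for key in product(*dims):
        if key not in table:
            if default is None:
                raise ValueError(f"missing entry for {key}")  # dense check
            table[key] = default  # sparse fill
    return table


adj = load_table(ADJACENCY_TSV, [PATCH, PATCH], default=0.0)
```

With default=0.0 the six listed rows become a full 3 × 3 table with a zero diagonal. Calling load_table without a default on the same text raises, because the diagonal entries are absent from the file.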

Verifying tables with `camdl inspect`

After compilation, camdl inspect --tables shows exactly what the compiler loaded — values, dimensions, and source file. This is the primary way to verify that external data parsed correctly before running simulation or inference:

$ camdl inspect seir_spatial_data.camdl --tables
pop  [patch]  loaded: data/demographics.tsv
  │ north  50000
  │ south  30000
  │ east   20000

init_sus  [patch]  loaded: data/demographics.tsv
  │ north  0.9
  │ south  0.85
  │ east   0.8

adj  [patch × patch]  loaded: data/adjacency.tsv
  │         north  south   east
  │  north      0    0.1   0.05
  │  south    0.1      0   0.08
  │  east    0.05   0.08      0

The pop, init_sus multi-value declaration produced two separate tables from the same file. The adjacency matrix — loaded sparse with default = 0.0 — is rendered as a full matrix with zeros filled in, making the connectivity pattern immediately visible. The loaded: and inline annotations tell you whether values came from a file or from the DSL source.

Data-derived dimensions

For models with many patches (Nigerian LGAs, London boroughs), you don’t want to type out hundreds of level names. Derive them from the data:

dimensions {
  patch = read("data/lga_pop.tsv", column = "patch")
}

The compiler reads the named column, collects unique values in first-occurrence order, and those become the dimension levels. Every table and transition referencing patch is validated against these levels. Change the data file, and the model’s spatial resolution changes with it — no code edits needed.
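Collecting unique values in first-occurrence order is a one-liner in Python, shown here as a sketch of the behavior described above. The derive_levels helper and the sample rows are hypothetical.

```python
def derive_levels(rows, column):
    """Unique values of one column, in first-occurrence order."""
    return list(dict.fromkeys(row[column] for row in rows))


# Hypothetical rows, as they might be parsed from a long-format file.
rows = [
    {"patch": "north", "pop": 50000},
    {"patch": "south", "pop": 30000},
    {"patch": "north", "pop": 50000},  # a repeated level adds nothing
    {"patch": "east", "pop": 20000},
]

levels = derive_levels(rows, "patch")
```

Because order follows first occurrence rather than sorting, reordering rows in the data file changes the level order, which matters wherever levels are referred to by position.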

Contact mixing patterns

The structure of a dimension’s mixing is encoded in its coupling table. Different diseases demand different patterns, and camdl handles them all through the same mechanism: a table indexed by dim × dim, used inside a sum() in the force of infection.

Dense symmetric: childhood infections

For measles or pertussis, age-structured contact drives dynamics. Children contact each other far more than they contact adults, but cross-group mixing still matters:

dimensions/seir_age_measles.camdl

# Measles-like: age-structured SEIR with WAIFW contact matrix
time_unit = 'days

compartments { S, E, I, R }

dimensions {
  age = [under5, school_age, adult]
}

stratify(by = age)

parameters {
  beta     : rate         in [0.01, 1.0]
  sigma    : rate         in [0.05, 1.0]
  gamma    : rate         in [0.01, 0.5]
}

tables {
  # WAIFW matrix (who-acquires-infection-from-whom)
  # School-age children drive transmission; adults less so.
  C_age : age × age = [[4.0,  8.0,  1.5],
                        [8.0, 14.0,  3.0],
                        [1.5,  3.0,  5.0]]
}

let N[a in age] = S[a] + E[a] + I[a] + R[a]

transitions {
  infection[a in age] : S[a] --> E[a]
    @ beta * S[a] * sum(b in age, C_age[a, b] * I[b] / N[b])

  progression[a in age] : E[a] --> I[a]  @ sigma * E[a]
  recovery[a in age]    : I[a] --> R[a]  @ gamma * I[a]
}

The sum(b in age, C_age[a, b] * I[b] / N[b]) computes the age-weighted force of infection: for each age group a, sum over all groups b the contact rate times the prevalence in b. The WAIFW matrix here is symmetric (contacts are reciprocal), but it need not be.
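As a numerical check on the formula, the sum can be evaluated by hand or with a few lines of Python. The contact matrix is the one from the listing; the state values for S, I, and N and the value of beta are made up for this illustration.

```python
AGE = ["under5", "school_age", "adult"]

# WAIFW matrix from the model listing above.
C_age = [
    [4.0, 8.0, 1.5],
    [8.0, 14.0, 3.0],
    [1.5, 3.0, 5.0],
]

# Hypothetical state and parameter values.
S = {"under5": 800, "school_age": 1500, "adult": 4000}
I = {"under5": 10, "school_age": 50, "adult": 20}
N = {"under5": 1000, "school_age": 2000, "adult": 5000}
beta = 0.05


def force_of_infection(a):
    """sum(b in age, C_age[a, b] * I[b] / N[b]) for one age group a."""
    row = C_age[AGE.index(a)]
    return sum(c_ab * I[b] / N[b] for c_ab, b in zip(row, AGE))


def infection_rate(a):
    """Total propensity of infection[a]: beta * S[a] * force of infection."""
    return beta * S[a] * force_of_infection(a)


rates = {a: infection_rate(a) for a in AGE}
```

For under5 the prevalences are 0.01, 0.025, and 0.004, so the sum is 4 × 0.01 + 8 × 0.025 + 1.5 × 0.004 = 0.246, and most of it comes from contact with school-age children.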

camdl inspect confirms the expansion — 3 age groups × 4 compartments = 12 expanded compartments, and the inline contact matrix:

$ camdl inspect seir_age_measles.camdl
seir_age_measles

  compartments   4 base × 3 age = 12 expanded
  transitions     3 base → 9 expanded (+ 0 filtered by where)
  parameters      3 declared (3 rate)
  tables          1 (C_age: age × age)
  let bindings    1 (N[a in age])
  dimensions      age = [under5, school_age, adult]
  observations    0 streams
  interventions   0 (0 active by default)

The --dims flag shows the dimensional analysis the compiler inferred for each parameter — useful for verifying that rate expressions have the right units:

$ camdl inspect seir_age_measles.camdl --dims
parameters (inferred dimensions):
  beta  : rate → T^-1 (per-capita rate)
  sigma : rate → T^-1 (per-capita rate)
  gamma : rate → T^-1 (per-capita rate)

Off-diagonal: sexually transmitted infections

The simplest STI models partition the population by sex and restrict transmission to cross-sex contacts — the contact matrix has zeros on the diagonal. This is a pedagogical simplification, not an epidemiological claim: gonorrhea and syphilis are transmitted within MSM populations, and realistic models would use an activity-level stratification (high/low risk) with a dense contact matrix rather than a strict male/female partition. But the off-diagonal pattern illustrates a general DSL feature — parameterized table entries with structural zeros:

dimensions/sis_sex_gonorrhea.camdl

# Gonorrhea-like: sex-structured SIS with directed transmission
time_unit = 'days

compartments { S, I }

dimensions {
  sex = [female, male]
}

stratify(by = sex)

parameters {
  beta_fm : rate  in [0.01, 0.5]    # female-to-male transmission
  beta_mf : rate  in [0.01, 0.5]    # male-to-female transmission
  gamma   : rate  in [0.01, 0.5]
}

tables {
  # Off-diagonal only: no within-sex transmission.
  B_sex : sex × sex = [[0.0,     beta_mf],
                        [beta_fm, 0.0    ]]
}

let N[s in sex] = S[s] + I[s]

transitions {
  infection[s in sex] : S[s] --> I[s]
    @ S[s] * sum(r in sex, B_sex[s, r] * I[r] / N[r])

  recovery[s in sex] : I[s] --> S[s]  @ gamma * I[s]
}

The table entries are parameter references, not floats — beta_fm and beta_mf are resolved at runtime. This lets the off-diagonal rates be estimated independently. The same mechanism — parameterized table entries — works for any contact matrix where you want to infer the mixing rates.
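One way to picture parameterized entries is as a template whose names are filled in once parameter values are known. The Python below is only a sketch of that idea; the resolve helper and the numeric values are hypothetical and say nothing about how camdl stores tables internally.

```python
# Rows index the susceptible's sex, columns the infectious partner's sex,
# in the order [female, male], matching B_sex[s, r] in the listing.
B_SEX_TEMPLATE = [
    [0.0, "beta_mf"],
    ["beta_fm", 0.0],
]


def resolve(template, params):
    """Replace parameter names with their current values."""
    return [
        [params[entry] if isinstance(entry, str) else entry for entry in row]
        for row in template
    ]


# Hypothetical parameter values, e.g. one draw during inference.
B_sex = resolve(B_SEX_TEMPLATE, {"beta_fm": 0.1, "beta_mf": 0.2})
```

The structural zeros stay fixed across every parameter draw, while the two off-diagonal entries move independently.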

Spatial gravity coupling: polio, measles metapopulations

For patch-structured models, transmission is local plus importation from other patches. The coupling strength depends on distance or connectivity — a gravity kernel:

dimensions/seir_spatial_polio.camdl

# Polio-like: 5-patch SEIR+V with gravity importation
time_unit = 'days

compartments { S, E, I, R, V }

dimensions {
  patch = [north, south, east, west, center]
}

stratify(by = patch)

let N[p in patch] = S[p] + E[p] + I[p] + R[p] + V[p]

parameters {
  sigma     : rate        in [0.01,  1.0]
  gamma     : rate        in [0.01,  1.0]
  kappa     : rate        in [0.0,   0.5]
  R0[patch] : positive    in [0.5,  10.0]  # patch-specific R0
  N0[patch] : count       in [100, 1000000]
  I0        : count       in [1,     1000]
}

let beta[p in patch] = R0[p] * gamma

tables {
  W : patch × patch = [[0.00, 0.10, 0.05, 0.05, 0.15],
                        [0.10, 0.00, 0.10, 0.05, 0.10],
                        [0.05, 0.10, 0.00, 0.10, 0.10],
                        [0.05, 0.05, 0.10, 0.00, 0.15],
                        [0.15, 0.10, 0.10, 0.15, 0.00]]
}

transitions {
  # Local transmission
  infection[p in patch] : S[p] --> E[p]
    @ beta[p] * S[p] * I[p] / N[p]

  # Cross-patch importation (where p != q excludes self-loops)
  importation[p in patch, q in patch] : S[p] --> E[p]
    @ kappa * W[p, q] * S[p] * I[q] / N[q] where p != q

  progression[p in patch] : E[p] --> I[p]  @ sigma * E[p]
  recovery[p in patch]    : I[p] --> R[p]  @ gamma * I[p]
}

Two things to note. First, R0[patch] is an indexed parameter: one scalar per patch, estimated independently. The compiler mangles R0[north] to R0_north in the IR; you supply per-patch values via --param-vec. Second, the where p != q clause on importation drops the p == q pairs at expansion time, so within-patch transmission comes only from the local infection transition, and the zero diagonal of W is enforced structurally, not just by convention.

For real deployments, the coupling matrix comes from data:

dimensions {
  patch = read("data/lga_pop.tsv", column = "patch")
}

tables {
  W : patch × patch = read("data/gravity_kernel.tsv", default = 0.0)
}

Change the data file from 5 patches to 774 Nigerian LGAs, and the model scales without editing a single line of camdl.

Within-stratum only: independent patches

The simplest spatial structure is no coupling at all — each patch evolves independently. This is the identity matrix case. You don’t need a coupling table; just stratify and write per-patch transitions:

dimensions {
  patch = [urban, rural]
}

stratify(by = patch)

transitions {
  infection[p in patch] : S[p] --> I[p]
    @ beta * S[p] * I[p] / N[p]
}

No sum(), no coupling table. Each patch gets its own copy of the transition, and I[p] / N[p] is local prevalence. Useful as a baseline before adding spatial coupling.

Partial stratification

Not every dimension applies to every compartment. Vaccination history matters for where recovered individuals go, but you don’t need to track vaccination status through the S → E → I progression. camdl handles this with the only clause:

# Polio-like: immunity type only on R and V
dimensions {
  immunity = [natural, vaccine]
}

stratify(by = immunity, only = [R, V])

After this, S, E, and I have their existing dimensions (say, age × patch), but R and V gain an extra immunity axis. The compiler then requires explicit routing in any transition that enters R or V — you must say where recovered individuals go:

transitions {
  recovery[a in age, p in patch] : I[a, p] --> R[a, p, natural]
    @ gamma * I[a, p]

  vaccination[a in age, p in patch] : S[a, p] --> V[a, p, vaccine]
    @ vacc_rate * S[a, p]
}

Stoichiometry requires all dimensions

Writing I[a, p] --> R[a, p] without the immunity index is a compile-time error: the destination R has dimensions [age, patch, immunity] but only [age, patch] were specified. You must write R[a, p, natural] to tell the compiler exactly which stratum receives the individual. This rule applies to all transitions into partially stratified compartments: omitting a dimension in a rate expression (right of @) sums over it, but omitting one from an explicitly named compartment in the stoichiometry (left of @) is an error. The one exception is the [c in compartments] quantifier described under Aging and demography, where the compiler iterates over the dimensions left unspecified.

Disease stages as partial dimensions

The same mechanism models Erlang-distributed waiting times. An SEIR where the latent period is Erlang-3 (three serial stages) uses a dimension on E only:

dimensions/seir_erlang.camdl (excerpt)

dimensions {
  latent_stage = [e1, e2, e3]
}

stratify(by = latent_stage, only = [E])

tables {
  sigma_stage : latent_stage = [sigma_e1, sigma_e2, sigma_e3]
}

transitions {
  infection : S --> E[e1]
    @ beta * S * I / N

  latent[(s, s_next) in consecutive(latent_stage)]
    : E[s] --> E[s_next]
    @ sigma_stage[s] * E[s]

  onset    : E[e3] --> I  @ sigma_stage[e3] * E[e3]
  recovery : I     --> R  @ gamma * I
}

consecutive(latent_stage) yields pairs (e1, e2), (e2, e3). The three-line latent declaration above is equivalent to writing each transition by hand:

  latent_e1 : E[e1] --> E[e2]  @ sigma_stage[e1] * E[e1]
  latent_e2 : E[e2] --> E[e3]  @ sigma_stage[e2] * E[e2]

For an Erlang-3 the saving is modest; for an Erlang-10 it eliminates nine lines of boilerplate that differ only in the index. The compiler generates one transition per consecutive pair, and --transitions shows exactly what was produced (see below).
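The pairing itself is the familiar "zip a list with its own tail" idiom. A Python sketch of what consecutive() is described as producing (the function below is an illustration, not camdl's implementation):

```python
def consecutive(levels):
    """Adjacent pairs of levels, in declaration order."""
    return list(zip(levels, levels[1:]))


latent_pairs = consecutive(["e1", "e2", "e3"])

# A ten-stage chain yields nine pairs, one per hand-written transition.
ten_stage_pairs = consecutive([f"e{i}" for i in range(1, 11)])
```

The last level never appears as the first element of a pair, which is why the exit from E[e3] needs its own onset transition.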

Infection enters at E[e1], progression walks through stages, and onset exits from the last stage. S, I, and R are unstratified — which camdl inspect confirms:

$ camdl inspect seir_erlang.camdl
seir_erlang

  compartments   4 base × 3 latent_stage = 6 expanded
  transitions     4 base → 5 expanded (+ 0 filtered by where)
  parameters      5 declared (5 rate)
  tables          1 (sigma_stage: latent_stage)
  let bindings    1 (N)
  dimensions      latent_stage = [e1, e2, e3]
  observations    0 streams
  interventions   0 (0 active by default)

The summary reads “4 base × 3 latent_stage = 6 expanded” — not 12, because only E was stratified. --compartments makes the mixed arities visible — S, I, and R have no index ([]), while E has [latent_stage]:

$ camdl inspect seir_erlang.camdl --compartments
S   integer   []   → S
E   integer   [latent_stage]   → E[e1], E[e2], E[e3]
I   integer   []   → I
R   integer   []   → R

6 expanded compartments (4 base × 3 latent_stage)

And --transitions shows how the compiler expanded the consecutive() chain and routed infection into E[e1]:

$ camdl inspect seir_erlang.camdl --transitions
infection → 1 transition
  │ infection : S → E[e1]   @ beta × S × I / S + E[e1] + E[e2] + E[e3] + I + R

latent[(s, s_next) in consecutive(latent_stage)] → 2 transitions
  │ latent_e1 : E[e1] → E[e2]   @ sigma_stage[0] × E[e1]
  │ latent_e2 : E[e2] → E[e3]   @ sigma_stage[1] × E[e2]

onset → 1 transition
  │ onset : E[e3] → I   @ sigma_stage[2] × E[e3]

recovery → 1 transition
  │ recovery : I → R   @ gamma × I
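One caveat on terminology: a chain of serial exponential stages has an Erlang-distributed total duration only when all stage rates are equal; with unequal sigma_stage values the latent period is hypoexponential instead. The Python sketch below computes the mean and variance of the total time for any set of stage rates, using the standard result that means and variances of independent exponential stages add. The rate values are hypothetical.

```python
import math


def chain_moments(stage_rates):
    """Mean and variance of the total time through serial exponential stages."""
    mean = sum(1.0 / r for r in stage_rates)
    variance = sum(1.0 / r ** 2 for r in stage_rates)
    return mean, variance


sigma = 0.1  # hypothetical overall rate: mean latent period of 10 days
k = 3        # number of stages

# Giving each stage the rate k * sigma keeps the overall mean at 1 / sigma.
mean, variance = chain_moments([k * sigma] * k)
cv = math.sqrt(variance) / mean  # coefficient of variation, 1 / sqrt(k)
```

With three equal stages the coefficient of variation falls from 1 (a single exponential stage) to about 0.58, which is the usual reason for adding stages: the latent period becomes less dispersed around its mean.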

Aging and demography

Fine-grained age structure requires more than a contact matrix — it needs aging transitions that move individuals through age groups over time, plus births and deaths. camdl uses the same consecutive() mechanism:

dimensions/sir_age_demography.camdl (excerpt)

dimensions {
  age = [age_0_5, age_5_15, age_15_50, age_50_65, age_65p]
}

stratify(by = age)

tables {
  age_dur : age 'years = [5, 10, 35, 15, 20]
}

let N_local[a in age] = S[a] + I[a] + R[a]

transitions {
  # Aging: consecutive pairs across all compartments
  aging[c in compartments, (a, a_next) in consecutive(age)]
    : c[a] --> c[a_next]
    @ (1 / age_dur[a]) * c[a]

  # Death: one transition per age group and compartment, at the uniform rate mu
  death[c in compartments, a in age] : c[a] -->
    @ mu * c[a]

  # Birth: into youngest age group
  birth : --> S[age_0_5]
    @ mu * sum(a in age, N_local[a])
}

The [c in compartments] quantifier iterates over all compartments — every compartment gets aging and death transitions. age_dur is a dimensioned table with a unit annotation ('years), giving the duration spent in each age class. The combinatorial expansion is substantial:

$ camdl inspect sir_age_demography.camdl
sir_age_demography

  compartments   3 base × 5 age = 15 expanded
  transitions     5 base → 38 expanded (+ 0 filtered by where)
  parameters      3 declared (3 rate)
  tables          1 (age_dur: age)
  let bindings    1 (N_local[a in age])
  dimensions      age = [age_0_5, age_5_15, age_15_50, age_50_65, age_65p]
  observations    0 streams
  interventions   0 (0 active by default)

5 transition templates expand to 38 concrete transitions: aging generates 3 compartments × 4 consecutive pairs = 12, death generates 3 × 5 = 15, plus infection (5), recovery (5), and birth (1).
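The count can be reproduced from the sizes of the index sets alone. A short Python check of the arithmetic, using only the numbers stated above:

```python
n_compartments = 3  # S, I, R
n_age = 5           # age_0_5 ... age_65p

aging = n_compartments * (n_age - 1)  # one per consecutive pair of age groups
death = n_compartments * n_age        # one per compartment and age group
infection = n_age
recovery = n_age
birth = 1

total_transitions = aging + death + infection + recovery + birth
```

Aging contributes n_age - 1 rather than n_age per compartment because the oldest group has no successor.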

Note the interaction with partial stratification: if R has an extra immunity dimension that S and I don’t have, death[c in compartments, a in age] : c[a] --> does the right thing automatically. For S (dims: [age]), it generates one death transition per age group. For R (dims: [age, immunity]), it generates separate transitions for each (age, immunity) combination — R[child, natural], R[child, vaccine], etc. The compiler fills in the omitted dimensions by iterating over them. You write the compact form; the expansion handles the arity differences.
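The arity-aware expansion described here can be sketched as a product over whichever dimensions the template leaves unspecified. The Python below is an illustration under simplifying assumptions: two age levels named child and adult (as in the prose example), a hypothetical COMPARTMENT_DIMS mapping, and a hypothetical expand_death helper.

```python
from itertools import product

DIMS = {
    "age": ["child", "adult"],
    "immunity": ["natural", "vaccine"],
}

# Which dimensions each compartment carries (hypothetical layout).
COMPARTMENT_DIMS = {
    "S": ["age"],
    "I": ["age"],
    "R": ["age", "immunity"],
}


def expand_death(compartment, age_level):
    """Concrete sources for death[c, a] : c[a] -->, filling omitted dims."""
    extra = [d for d in COMPARTMENT_DIMS[compartment] if d != "age"]
    combos = product(*(DIMS[d] for d in extra))
    return [
        f"{compartment}[{', '.join((age_level, *combo))}]" for combo in combos
    ]


death_sources = [
    source
    for c in COMPARTMENT_DIMS
    for a in DIMS["age"]
    for source in expand_death(c, a)
]
```

For S and I the product over zero extra dimensions yields a single empty combination, so each age group produces one transition; for R it yields one per immunity level.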

Guard clauses

The importation transition in the spatial polio model above uses where p != q to exclude self-loops. This is a guard clause: a compile-time filter on which index combinations generate transitions.

# Migration: exclude self-loops
migrate[c in compartments, a in age, src in patch, dst in patch]
  : c[a, src] --> c[a, dst]
  @ mig[dst, src] * c[a, src]
  where src != dst

# Only adults reproduce
birth[a in age, p in patch] : --> S[child, p]
  @ fertility[a] * N_local[a, p]
  where a != child

# Compound guard
transfer[a in age, src in patch, dst in patch] : S[a, src] --> S[a, dst]
  @ rate * S[a, src]
  where src != dst and a == adult

Guards are evaluated at compile time — the compiler instantiates all index combinations, evaluates the guard for each, and emits IR transitions only for those that pass. The runtime never sees the guard; it just gets the filtered set of transitions. Guards compose with all iteration forms: [i in dim], consecutive(), and [c in compartments].
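Expansion with a guard is equivalent to filtering a Cartesian product before anything is emitted. A Python sketch of that process, where expand is a hypothetical helper and the level names are borrowed from earlier examples:

```python
from itertools import product

AGE = ["child", "adult"]
PATCH = ["north", "south", "east", "west", "center"]


def expand(index_sets, guard):
    """Instantiate every index combination and keep those passing the guard."""
    return [combo for combo in product(*index_sets) if guard(*combo)]


# where p != q
importation = expand([PATCH, PATCH], lambda p, q: p != q)

# where src != dst and a == adult
transfer = expand(
    [AGE, PATCH, PATCH],
    lambda a, src, dst: src != dst and a == "adult",
)

filtered_out = len(PATCH) ** 2 - len(importation)
```

Five patches give 25 ordered pairs, of which the five self-pairs are dropped. The compound guard leaves the same 20 pairs, but only for adults.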

Multiple dimensions: putting it together

Real models combine several dimensions. A spatial age-structured measles model might have age × patch with a contact matrix over age and a gravity kernel over patches. The force of infection nests both sums:

dimensions {
  age   = [child, adult]
  patch = read("data/pop.tsv", column = "patch")
}

stratify(by = age)
stratify(by = patch)

tables {
  C_age : age × age     = [[12.0, 4.0], [4.0, 8.0]]
  W     : patch × patch  = read("data/gravity.tsv", default = 0.0)
  pop   : patch           = read("data/pop.tsv")
}

let N[a in age, p in patch] = S[a, p] + E[a, p] + I[a, p] + R[a, p]

transitions {
  infection[a in age, p in patch] : S[a, p] --> E[a, p]
    @ beta * S[a, p]
      * sum(b in age, sum(q in patch,
          C_age[a, b] * W[p, q] * I[b, q] / N[b, q]
        ))
}

Each dimension contributes one sum(). The compiler expands the nested iteration at compile time — for 2 age groups and 50 patches, the infection rate for each (a, p) stratum becomes a sum of 100 terms, all type-checked against the declared dimensions of C_age and W.
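The size of the expansion follows directly from the index sets. A Python check of the term count, with generated patch names standing in for the levels that would be read from the data file:

```python
from itertools import product

AGE = ["child", "adult"]
PATCH = [f"patch_{i}" for i in range(50)]  # hypothetical level names

# Terms in the nested sum for one (a, p) stratum: one per (b, q) pair.
terms_per_stratum = len(list(product(AGE, PATCH)))

# One infection transition per (a, p) stratum.
n_strata = len(AGE) * len(PATCH)

# Total terms across all expanded infection transitions.
total_terms = n_strata * terms_per_stratum
```

The per-stratum sum has 100 terms, and with 100 strata the expanded infection block contains 10,000 terms in total, which is one reason a sparse W with many zero entries matters at scale.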

The seir_cross_dim.camdl example in this chapter’s files puts all of these features together — patch × age stratification, contact matrix, spatial adjacency, demography — and camdl inspect shows the full expansion:

$ camdl inspect seir_cross_dim.camdl
seir_cross_dim

  compartments   4 base × 4 patch × 3 age = 48 expanded
  transitions     6 base → 120 expanded (+ 0 filtered by where)
  parameters      5 declared (5 rate)
  tables          4 (pop_pa: patch × age, adj: patch × patch, age_dur: age, C: age × age)
  let bindings    3 (N[p in patch, a in age], N_total[p in patch], I_total[p in patch])
  dimensions      patch = [north, south, east, west], age = [child, adult, elder]
  observations    0 streams
  interventions   0 (0 active by default)

$ camdl inspect seir_cross_dim.camdl --tables
pop_pa  [patch × age]  loaded: data/patch_age_pop3.tsv
  │          child   adult   elder
  │  north   22000   72000   26000
  │  south   16000   52000   17000
  │  east     8500   28000    8500
  │  west    38000  124000   38000

adj  [patch × patch]  loaded: data/spatial_adj.tsv
  │         north  south   east   west
  │  north      0  0.008  0.003      0
  │  south  0.008      0  0.005  0.012
  │  east   0.003  0.005      0      0
  │  west       0  0.012      0      0

age_dur  [age]  inline
  │ child  5475
  │ adult  16425
  │ elder  36500

C  [age × age]  inline
  │         child  adult  elder
  │  child     18      6      1
  │  adult      6     12      2
  │  elder      1      2      5

6 transition templates expand to 120 transitions across 48 compartments. The tables show both file-loaded data (loaded:) and inline values (inline), with 2D tables rendered as matrices for immediate verification.

Feature inventory: which diseases need what

The table below maps common modeling patterns to the camdl features that support them. Each row is a pattern seen in practice; the examples show which diseases motivate it.

Pattern	camdl feature	Disease examples
Age-structured contact	`dim × dim` table + `sum()`	Measles, pertussis, COVID-19
Directed (cross-group only)	Off-diagonal contact matrix	Gonorrhea, syphilis, HIV
Spatial metapopulation	`patch × patch` gravity kernel	Polio, measles, cholera
Data-derived patches	`read("file", column = ...)`	Any sub-national model
Indexed reproduction number	`R0[patch] : positive`	Polio, malaria (heterogeneous transmission)
Consecutive aging	`consecutive(age)` + `age_dur` table	Any age-structured endemic model
Erlang-distributed stages	`stratify(by = stage, only = [E])`	Measles (latent), malaria (liver stages)
Vaccination as partial dim	`stratify(by = immunity, only = [R, V])`	Polio (OPV/IPV), COVID-19
Seasonal forcing	Time-dependent rate expression	Measles (school terms), cholera (monsoon)
Campaign interventions on strata	`transfer(from = S[patch], to = V[patch])`	Polio SIAs, MDA for malaria
Sparse coupling matrix	`read("file", default = 0.0)`	Any large spatial model
Parameterized mixing rates	Table entries as parameter refs	STIs (sex-specific β), flu (age-specific)

Design philosophy

camdl’s dimension system reflects a few deliberate choices:

Dimensions are type-level, not value-level. The levels of a dimension are known at compile time (either declared inline or read from a file during compilation). This means the compiler can expand all indexed transitions, check all table lookups, and verify population accounting before any simulation runs. There is no runtime dimension discovery.

Tables are the interface between data and model. External data enters the model through dimensionally-typed tables, not through untyped arrays or global variables. The dimension annotation on a table is a contract: it says what the rows and columns mean, and the compiler enforces it. This makes the data pipeline part of the model’s type system, catching mismatches (wrong number of patches, misspelled level names) at compile time rather than producing silently wrong simulation output.

Explicit over implicit. You write sum(b in age, C[a,b] * I[b] / N[b]) rather than declaring a mixing mode. This is more verbose than a coupling = "frequency_dependent" flag, but it means the rate expression is always readable as math — no hidden multiplication, no framework magic. The transition rate is the equation.

Partial stratification forces explicit routing. When a dimension applies to some compartments but not others, every transition into a partially-stratified compartment must specify which stratum the population flows to. This is annoying for the modeler and essential for correctness — it prevents the framework from silently distributing individuals across strata by some default rule.

These choices trade convenience for safety. For a 3-compartment teaching model, the overhead is negligible. For a 774-patch, 5-age-group, multi-intervention polio model, the compiler’s checking is the difference between a model that works and one that silently computes the wrong force of infection for six months before anyone notices.

¹ King AA, Nguyen D, Ionides EL (2016). “Statistical inference for partially observed Markov processes via the R package pomp.” Journal of Statistical Software, 69(12), 1–43. doi:10.18637/jss.v069.i12.
² The manual expansion pattern is documented in the pomp FAQ and discussed in kingaa/pomp#11, where a user asks whether vector-valued state is supported. The canonical workaround uses C pointer arithmetic over statenames=sprintf("X%d",1:n). A typo between two valid expanded names (e.g. S2 for S3) compiles without error because both are declared statenames; the silent failure requires both names to exist, which is exactly the situation in any stratified model with more than one stratum.
³ Asfaw K, Ionides EL et al. (2024). spatPomp: Statistical Inference for Spatiotemporal Partially Observed Markov Processes. CRAN package. The dunit_measure C snippet receives a predefined integer u (current unit index, 0 to U−1) and bare unit_statenames resolve to unit u’s values. The rprocess snippet does not: it requires explicit S[u], E[u] indexing. See the spatPomp tutorial vignette, §3 (unit-level measurement models).
⁴ FitzJohn R et al. (2024). odin2: an R package for generating dynamical models. CRAN package odin2. See also dust2 (FitzJohn R et al., 2024) for the compiled simulation backend.
⁵ odin2 error E2022: “Trying to access an array without using square bracket indexes.” Documented at mrc-ide.github.io/odin2/articles/errors.html#e2022. This error exists in both odin1 (tested in test-parse2-general.R, "Array 'x' used without array index") and odin2. An earlier version of this chapter incorrectly stated that bare array references silently resolved to the first element; we are grateful to reviewers for the correction.
⁶ The untyped-index hazard: if dim(S) <- c(n_age, n_patch) and you write S[j, i] instead of S[i, j], it parses cleanly. When n_age == n_patch it runs with transposed values; when they differ, you get a runtime bounds error rather than a compile-time type error. The same issue affects reduction operations: sum(S) vs. sum(S[i, ]) are semantically different (global total vs. row sum), but the compiler cannot distinguish intent from syntax.