Food Web Generation and Measurement I

Links and Detection

community ecology
food webs
hierarchical modeling
Published

November 17, 2025

Introduction

Food webs are often represented as network graphs, where nodes represent species and links represent observed feeding interactions, or potential feeding interactions inferred from species traits. The Web of Life database houses many such graphs, showcasing their interesting structural patterns and the tremendous effort ecologists invest in collecting these challenging data.

A particular challenge when collecting food web data is knowing whether sampling effort is sufficient. Consider the implications of missing some feeding links. If an interaction involving even a single rare species goes unobserved, it can alter the overall network structure. Since many theories of food webs aim directly at explaining that structure, we need measurement models that help us infer whether the observed structure results from fundamental processes of community assembly or from partial sampling. In this blog post, we explore this measurement problem and develop some preliminary simulations and validations.

To begin, let’s lay out a basic model. Assume we want to infer species interactions across multiple food webs using a hierarchical model. This lets us pool information across webs to understand our ability to detect trophic links.

For each food web \(k = 1,...,K\), we observe an interaction indicator \(A^{\text{obs}}_{ij,k} \in \{0,1\}\) for every ordered species pair. A value of \(1\) indicates the presence of a link \(i \rightarrow j\). Given this, we model the probability of a link \(\pi_{ij,k}\) as

\[ \begin{equation} \text{logit}(\pi_{ij,k}) = \alpha_k + u_{i,k} + v_{j,k} \end{equation} \]

where \(\alpha_k\) is a baseline interaction rate for web \(k\), and \(u_{i,k}\) and \(v_{j,k}\) are consumer- and resource-specific random effects. We could then say that the “true” set of interactions is Bernoulli distributed

\[ \begin{equation} A^{\text{true}}_{ij,k} \sim \text{Bernoulli}(\pi_{ij,k}) \end{equation} \] In other words, there is a probability \(\pi_{ij,k}\) that a true link exists from \(i\) to \(j\) in food web \(k\). If sampling were perfect, this probability could be estimated directly from the observed adjacency matrices. But if sampling is biased, the estimate will inherit that bias. For this reason, we also need a detection model.

\[ \begin{equation} \text{logit}(p_k) = \mu_p + \sigma_p p^{(z)}_k \end{equation} \]

We assume that each web has its own detection probability \(p_k\). The parameter \(\mu_p\) is the average log-odds of detection across all food webs, and \(\sigma_p p^{(z)}_k\) is the non-centered deviation of web \(k\) from that global average.

It follows that the probability of observing a link is

\[ \begin{equation} q_{ij,k} = p_k \cdot \pi_{ij,k} \end{equation} \] The likelihood for the observed links is then

\[ \begin{equation} A^{\text{obs}}_{ij,k} \sim \text{Bernoulli}(q_{ij,k}) \end{equation} \]

In essence, this model assumes that

  • If \(A^{\text{true}}_{ij,k} = 0\), then \(A^{\text{obs}}_{ij,k} = 0\).
  • If \(A^{\text{true}}_{ij,k} = 1\), then we observe \(A^{\text{obs}}_{ij,k} = 1\) with probability \(p_k\).

Although false positives are also a possible source of bias, we leave them out of the model for now.
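
To see why this thinned likelihood follows from the two assumptions above, here is a quick numerical check (not part of the original analysis; the values of \(\pi\) and \(p\) are arbitrary). Marginalizing over the latent true links gives the same observation rate as drawing directly from \(\text{Bernoulli}(p \cdot \pi)\).

# sketch: marginalizing true links + imperfect detection
set.seed(1)
pi_ij = 0.4                                # hypothetical true link probability
p     = 0.3                                # hypothetical detection probability
n     = 1e5
A_true_sim = rbinom(n, 1, pi_ij)             # latent true links
A_obs_sim  = A_true_sim * rbinom(n, 1, p)    # observed only if present AND detected
c(two_stage = mean(A_obs_sim), thinned = p * pi_ij)  # both approximately 0.12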

Finally, we need to declare hierarchical, non-centered priors for all parameters, beginning with the interaction rates, then the detection probabilities, and finally the random effects. We assume a weakly informative \(\mathcal{N}(0,1)\) prior on all latent deviations and an \(\text{Exponential}(1)\) prior on all standard deviations.

Interaction rates

\[ \begin{align} \pi_{ij,k} &= \text{logit}^{-1}(\alpha_k + u_{i,k} + v_{j,k}) \qquad &\text{true link probability} \\ \alpha_k &= \mu_{\alpha} + \sigma_{\alpha} \alpha^{(z)}_k \qquad &\text{non-centered} \\ \alpha^{(z)} &\sim \mathcal{N}(0,1) \qquad &\text{latent deviation} \\ \mu_{\alpha} &\sim \mathcal{N}(0,1) \qquad &\text{global mean interaction rate} \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \qquad &\text{global interaction SD} \end{align} \]

Detection

\[ \begin{align} p_k &= \text{logit}^{-1}(\mu_{p} + \sigma_{p} p^{(z)}_{k}) \qquad &\text{non-centered} \\ p^{(z)}_k &\sim \mathcal{N}(0,1) \qquad &\text{latent deviation} \\ \mu_{p} &\sim \mathcal{N}(0,1) \qquad &\text{global mean detection rate} \\ \sigma_{p} &\sim \text{Exponential}(1) \qquad &\text{global detection SD} \end{align} \]

Consumer and resource random effects

\[ \begin{align} u_{i,k} &= \sigma_u u^{(z)}_{i,k}, \qquad &u^{(z)}_{i,k} \sim \mathcal{N}(0,1) \\ v_{j,k} &= \sigma_v v^{(z)}_{j,k}, \qquad &v^{(z)}_{j,k} \sim \mathcal{N}(0,1) \\ \sigma_u &\sim \text{Exponential}(1) \qquad &\sigma_v \sim \text{Exponential}(1) \end{align} \]
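
Before simulating data, it can help to see what these priors imply on the probability scale. The sketch below (not part of the original workflow) draws the hyperparameters from their priors and summarizes the implied baseline link and detection probabilities; plogis() is base R’s inverse logit.

# prior predictive sketch for the baseline link and detection rates
set.seed(1)
n_draws     = 1000
mu_alpha_pp = rnorm(n_draws, 0, 1)
sd_alpha_pp = rexp(n_draws, 1)
alpha_pp    = rnorm(n_draws, mu_alpha_pp, sd_alpha_pp)   # implied alpha_k draws
mu_p_pp     = rnorm(n_draws, 0, 1)
sd_p_pp     = rexp(n_draws, 1)
p_pp        = plogis(rnorm(n_draws, mu_p_pp, sd_p_pp))   # implied detection p_k

quantile(plogis(alpha_pp), c(0.05, 0.5, 0.95))   # baseline link probability
quantile(p_pp, c(0.05, 0.5, 0.95))               # detection probability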

Simulation

To test this model, we need to simulate food webs to serve as the “true” sets of interactions, and “observe” our food webs probabilistically from them. Here are the packages used in this workflow.

library(tidyverse)
library(tidygraph)
library(ggraph)
library(graphlayouts)
library(igraph)
library(patchwork)
library(latex2exp)

We begin by setting up our parameters.

# ---- Setup ----
set.seed(777)
inv_logit = function(x) exp(x) / (1 + exp(x))

# ---- Parameters ----
K = 3   
S = 30
mu_alpha = -2
sigma_alpha = 1
sigma_u = 2
sigma_v = 1
mu_p = -1
sigma_p = 0.5

# generate varying base rates and detection probabilities 
alpha_k = rnorm(K, mu_alpha, sigma_alpha); alpha_k
[1] -1.510214 -2.398541 -1.489164
p_k = inv_logit(rnorm(K, mu_p, sigma_p)); p_k
[1] 0.2315809 0.4549582 0.3341748

We then set up some containers for the simulation results.

# ---- Outputs ----
A_true = array(0, dim = c(S,S,K))
A_obs  = array(0, dim = c(S,S,K))

The basic procedure is the following:

  • Generate \(u_{i,k}\) and \(v_{j,k}\).
  • Compute \(\pi_{ij,k}\) from \(\alpha_k\) plus the random effects.
  • Fill the \(S \times S\) “true” matrix by drawing links from \(\text{Bernoulli}(\pi_{ij,k})\).
  • Fill the \(S \times S\) “observed” matrix by drawing links from \(\text{Bernoulli}(p_k \cdot \pi_{ij,k})\).

for(k in 1:K) {
  
  # Random effects
  u_i = rnorm(S, mean = 0, sd = sigma_u)
  v_j = rnorm(S, mean = 0, sd = sigma_v)
  
  # Loop 
  for(i in 1:S) for(j in 1:S) {
    
    pi_ijk = inv_logit(alpha_k[k] + u_i[i] + v_j[j])
    
    A_true[i,j,k] = rbinom(1, 1, pi_ijk)
    A_obs[i,j,k] = rbinom(1, 1, p_k[k] * pi_ijk)
  }
}

This yields the following summary.

dat = data.frame(
  web            = 1:K, 
  true_links     = apply(A_true, 3, sum), 
  observed_links = apply(A_obs,  3, sum), 
  detection      = p_k, 
  alpha          = alpha_k
); dat
  web true_links observed_links detection     alpha
1   1        297             55 0.2315809 -1.510214
2   2        124             49 0.4549582 -2.398541
3   3        185             66 0.3341748 -1.489164
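
As an aside, the nested \(i\)/\(j\) loop can also be vectorized with outer(). The version below is a sketch of an equivalent approach (it draws fresh random numbers and stores results in separate arrays, so it does not reproduce the seeded output above).

# vectorized alternative to the per-web double loop (equivalent in distribution)
A_true_v = array(0, dim = c(S, S, K))
A_obs_v  = array(0, dim = c(S, S, K))

for(k in 1:K) {
  u_i  = rnorm(S, mean = 0, sd = sigma_u)
  v_j  = rnorm(S, mean = 0, sd = sigma_v)
  pi_k = inv_logit(alpha_k[k] + outer(u_i, v_j, "+"))   # S x S matrix of link probabilities
  
  A_true_v[,,k] = matrix(rbinom(S * S, 1, pi_k), S, S)
  A_obs_v[,,k]  = matrix(rbinom(S * S, 1, p_k[k] * pi_k), S, S)
}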

If we examine the out-degree distributions across all of the webs, we can see how the observed webs differ from the true webs once detection is accounted for.

deg = function(A) apply(A, 3, function(mat) rowSums(mat)) |>  as.vector()

data.frame(
  True      = deg(A_true),
  Observed  = deg(A_obs)
) |> 
  gather(key = key, value = degree) |> 
  ggplot(aes(degree)) + 
  geom_histogram(color='white', fill='black') + 
  facet_wrap(~key) + 
  labs(x = 'Outgoing links', y = 'Count')

The out-degree distribution across all simulated webs.

We can see that the degree distribution of the observed links is more zero-inflated and that the tail of the distribution has contracted. Altering the values of \(\sigma_u\) and \(\sigma_v\) can change these properties to some extent, especially the heaviness of the tail.
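
To illustrate the effect of \(\sigma_u\), here is a small, purely illustrative sketch (not in the original post) that regenerates a single web’s out-degrees for a few values of \(\sigma_u\); larger consumer heterogeneity tends to produce a heavier right tail, with a few generalists holding many links.

# how consumer heterogeneity shapes the out-degree tail (illustrative only)
sim_outdegree = function(sigma_u_val) {
  u = rnorm(S, mean = 0, sd = sigma_u_val)
  v = rnorm(S, mean = 0, sd = sigma_v)
  P = inv_logit(mu_alpha + outer(u, v, "+"))
  rowSums(matrix(rbinom(S * S, 1, P), S, S))
}

set.seed(42)
sapply(c(0.5, 2, 4), function(s) summary(sim_outdegree(s)))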

A_true1 = A_true[,,1]
A_obs1  = A_obs[,,1]

row_sums = rowSums(A_true1)
sort_order = order(row_sums)

par(mfrow=c(1,2))
image(A_true1[sort_order, sort_order], 
      col = c('black','tomato'), 
      main = 'True', 
      xaxt = 'n', yaxt = 'n')
image(A_obs1[sort_order, sort_order],  
      col = c('black','tomato'), 
      main = 'Observed', 
      xaxt = 'n', yaxt = 'n')

Examples of true and observed adjacency matrices.

Model Validation

The next step is to develop the hierarchical model in Stan and validate it using the true and observed interaction matrices. First, we define the model block to mirror our simulation and the model notation. This block includes the priors for all parameters and hyperparameters.

model {
  // priors for latent z-parameters 
  p_z ~ normal(0,1); 
  alpha_z ~ normal(0,1);
  to_vector(u_z) ~ normal(0,1); 
  to_vector(v_z) ~ normal(0,1); 
  
  //priors for hyperparameters 
  mu_p ~ normal(0,1);
  sigma_p ~ exponential(1);
  mu_alpha ~ normal(0,1);
  sigma_alpha ~ exponential(1); 
  sigma_u ~ exponential(1);
  sigma_v ~ exponential(1);
  
  // likelihood 
  for(k in 1:K) {
    // detection random effect
    real p_k = inv_logit(mu_p + sigma_p * p_z[k]);
    
    // link random effect
    real alpha_k = mu_alpha + sigma_alpha * alpha_z[k]; 
    
    for(i in 1:N_cons) {
      // consumer random effects 
      real u_ik = sigma_u * u_z[i,k];
      
      for(j in 1:N_res) {
        // resource random effect 
        real v_jk = sigma_v * v_z[j,k];
        
        // link probability 
        real pi_ijk = inv_logit(alpha_k + u_ik + v_jk);
        
        // detection adjusted probability 
        real q_ijk = p_k * pi_ijk;
        
        // likelihood
        A[i,j,k] ~ bernoulli(q_ijk);
      }
    }
  }
}

For all parameters that have priors, we declare them in the parameters block.

parameters {
  vector[K] p_z;
  vector[K] alpha_z;
  matrix[N_cons, K] u_z;
  matrix[N_res, K] v_z;
  
  real mu_p; 
  real<lower=0> sigma_p; 
  real mu_alpha; 
  real<lower=0> sigma_alpha; 
  real<lower=0> sigma_u; 
  real<lower=0> sigma_v;
}

Finally, we set up the data needed for the model.

data_list = list(
  N_cons = S, 
  N_res  = S, 
  K      = K, 
  A      = A_obs
)

This corresponds to

data {
  int<lower=1> N_cons;
  int<lower=1> N_res;
  int<lower=1> K;
  // observed adjacency
  array[N_cons, N_res, K] int<lower=0, upper=1> A; 
}

View the full Stan model.


data {
  int<lower=1> N_cons;
  int<lower=1> N_res;
  int<lower=1> K;
  // observed adjacency
  array[N_cons, N_res, K] int<lower=0, upper=1> A; 
}


parameters {
  // non-centered latent effects
  vector[K] p_z;
  vector[K] alpha_z;
  matrix[N_cons, K] u_z;
  matrix[N_res, K] v_z;
  
  // hyperparameters
  real mu_p; 
  real<lower=0> sigma_p; 
  real mu_alpha; 
  real<lower=0> sigma_alpha; 
  real<lower=0> sigma_u; 
  real<lower=0> sigma_v;
}

model {
  // priors for latent z-parameters 
  p_z ~ normal(0,1); 
  alpha_z ~ normal(0,1);
  to_vector(u_z) ~ normal(0,1); 
  to_vector(v_z) ~ normal(0,1); 
  
  //priors for hyperparameters 
  mu_p ~ normal(0,1);
  sigma_p ~ exponential(1);
  mu_alpha ~ normal(0,1);
  sigma_alpha ~ exponential(1); 
  sigma_u ~ exponential(1);
  sigma_v ~ exponential(1);
  
  // likelihood 
  for(k in 1:K) {
    // detection random effect
    real p_k = inv_logit(mu_p + sigma_p * p_z[k]);
    
    // link random effect
    real alpha_k = mu_alpha + sigma_alpha * alpha_z[k]; 
    
    for(i in 1:N_cons) {
      // consumer random effects 
      real u_ik = sigma_u * u_z[i,k];
      
      for(j in 1:N_res) {
        // resource random effect 
        real v_jk = sigma_v * v_z[j,k];
        
        // link probability 
        real pi_ijk = inv_logit(alpha_k + u_ik + v_jk);
        
        // detection adjusted probability 
        real q_ijk = p_k * pi_ijk;
        
        // likelihood
        A[i,j,k] ~ bernoulli(q_ijk);
      }
    }
  }
}

After saving this model, we fit it using cmdstanr like so. This model samples in under a minute with 4 parallel chains when S = 30 and K = 3.

library(cmdstanr)
mod <- cmdstan_model("detection_model.stan")

fit1 <- mod$sample(
  data = data_list,
  seed = 123,
  chains = 4,
  parallel_chains = 4,
  adapt_delta = 0.95
)

fit1$save_output_files()
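
Before checking parameter recovery, it is worth a quick look at the sampler diagnostics and the hyperparameter summaries (a sketch; output not shown).

# quick checks: divergences/treedepth and hyperparameter summaries
fit1$diagnostic_summary()
fit1$summary(variables = c("mu_alpha", "sigma_alpha", "mu_p", "sigma_p", "sigma_u", "sigma_v"))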

Our primary task is to determine whether the model can recover the parameters that we used in the simulation. Two focal parameters are the baseline interaction rate \(\alpha_k\) and the detection probability \(p_k\). As a reminder, we generated these values by setting \(\mu\) and \(\sigma\) hyperparameters for each and then drawing them \(K\) times.

focal_pars = data.frame(k = factor(1:K), alpha_k, p_k)
focal_pars
  k   alpha_k       p_k
1 1 -1.510214 0.2315809
2 2 -2.398541 0.4549582
3 3 -1.489164 0.3341748

Since we modeled these as non-centered parameters, we need to reconstruct them from the posterior draws. We will do this using the tidybayes package for handling Bayesian models in R.1 Let’s start with \(\alpha_k\). We will generate a predicted \(\hat{\alpha}_k\) by pulling out alpha_z[k], mu_alpha, and sigma_alpha and then computing alpha_k_hat.

library(tidybayes)

alpha_k_draws = fit1 |> 
  spread_draws(mu_alpha, sigma_alpha, alpha_z[k]) |>
  group_by(k) |> 
  mutate(alpha_k_hat = mu_alpha + sigma_alpha * alpha_z) 

We can summarize the posterior distribution in many ways. Here I will plot this distribution and overlay the original values.
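
The plotting code itself is not shown; something along the following lines, mirroring the \(\hat{p}_k\) plot further below, would produce the figure (the exact colors are a guess).

# sketch of the alpha_k recovery plot (mirrors the p_k plot below)
alpha_k_draws |> 
  ggplot(aes(x = alpha_k_hat, y = factor(k))) + 
  stat_pointinterval(color = 'black', .width = c(0.67, 0.9)) + 
  labs(x = TeX("$\\hat{\\alpha}_k$"), y = 'k') + 
  geom_point(data = focal_pars, 
             aes(x = alpha_k, y = k), 
             color = 'white', pch = 8, size = 4, stroke = 2) +
  geom_point(data = focal_pars, 
             aes(x = alpha_k, y = k), 
             color = 'magenta3', pch = 8, size = 2, stroke = 2)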

Comparison of predicted \(\hat{\alpha}_k\) to the input values of \(\alpha_k\) (magenta \(*\)). Points are posterior means, and bars are 67% and 90% highest density posterior intervals.

The true \(\alpha_k\) falls within the posterior intervals in all cases, and comes extremely close to the posterior mean for webs 1 and 3. We can now do the same to compute \(\hat{p}_k\), though in this case we need to transform the model parameters from the log-odds scale to the probability scale.

p_k_draws = fit1 |> 
  spread_draws(p_z[k], sigma_p, mu_p) |> 
  group_by(k) |> 
  mutate(p_k_hat = inv_logit(mu_p + sigma_p*p_z))

p_k_draws |> 
  ggplot(aes(x=p_k_hat, y=factor(k))) + 
  stat_pointinterval(color='black', .width=c(0.67,0.9)) + 
  labs(x = TeX("$\\hat{p}_k$"), y = 'k') + 
  geom_point(data=focal_pars, 
             aes(x = p_k, y=k), 
             color = 'white', pch=8, size=4, stroke=2) +
  geom_point(data=focal_pars, 
             aes(x = p_k, y=k), 
             color = 'cyan3', pch=8, size=2, stroke=2)

Comparison of predicted \(\hat{p}_k\) to the input values of \(p_k\) (cyan \(*\)). Points are posterior means, and bars are 67% and 90% highest density posterior intervals.

Again we come very close to recovering the true parameters, with the biggest difference being in web 2. Although we can certainly be more precise, this is still a good sign that the model could, in principle, estimate the latent detection probability \(p_k\).
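
As a rough back-of-the-envelope use of these estimates (not part of the original analysis): under the model, the expected number of observed links in web \(k\) is \(p_k\) times the expected number of true links, so dividing the observed counts by the posterior mean \(\hat{p}_k\) gives a crude plug-in estimate of how many links each web “really” has.

# crude correction of observed link counts by estimated detection
p_k_draws |> 
  summarise(p_hat = mean(p_k_hat)) |> 
  left_join(dat |> mutate(k = web), by = "k") |> 
  mutate(corrected_links = observed_links / p_hat) |> 
  select(k, true_links, observed_links, corrected_links)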

References

Allesina, Stefano. 2011. “Predicting Trophic Relations in Ecological Networks: A Test of the Allometric Diet Breadth Model.” Journal of Theoretical Biology 279 (1): 161–68.
Petchey, Owen L, Andrew P Beckerman, Jens O Riede, and Philip H Warren. 2008. “Size, Foraging, and Food Web Structure.” Proceedings of the National Academy of Sciences 105 (11): 4191–96.

Footnotes

  1. Details of how to use tidybayes can be found here.↩︎

  2. Allesina (2011) explores additional possibilities, such as the Ratio model, which we discuss in more detail in the next post.↩︎