Food Web Generation and Measurement II

Non-Random Detection

community ecology
food webs
hierarchical modeling
Published

November 20, 2025

Introduction

In the previous post, we discussed how the probability of a feeding link between two species depends in part on the probability of detecting the link during the sampling effort. Accounting for this is an important step toward understanding whether the food web structure we are studying is due to ecological processes that underlie community assembly, or some sampling bias (or both).

As a reminder, our existing model treats an observed link between consumer \(i\) and resource \(j\) in web \(k\) (\(A_{ij,k}\)) as follows:

Detection Model

\[ \begin{align} A_{ij,k} &\sim \text{Bernoulli}(q_{ij,k}) \\ q_{ij,k} &= p_k \pi_{ij,k} \\ p_k &= \text{logit}^{-1}(\mu_{p} + \sigma_p p^{(z)}_k) \\ \pi_{ij,k} &= \text{logit}^{-1}(\alpha_k + u_{i,k} + v_{j,k}) \\ \alpha_k &= \mu_{\alpha} + \sigma_{\alpha} \alpha^{(z)}_k \\ u_{i,k} &= \sigma_u u^{(z)}_{i,k} \\ v_{j,k} &= \sigma_v v^{(z)}_{j,k} \\ p^{(z)}_k, \alpha^{(z)}_k, u^{(z)}_{i,k},v^{(z)}_{j,k} &\sim \mathcal{N}(0,1) \\ \sigma_p,\sigma_{\alpha},\sigma_u, \sigma_v &\sim \text{Exponential}(1) \end{align} \]

| Notation | Meaning |
|---|---|
| \(\pi_{ij,k}\) | true ecological interaction probability |
| \(p_k\) | detection probability for web \(k\) |
| \(q_{ij,k}\) | probability of observing interaction |
| \(\alpha_k\) | baseline interaction rate in web \(k\) |
| \(u_{i,k}\) | consumer \(i\) feeding generality |
| \(v_{j,k}\) | resource \(j\) feeding vulnerability |
| \(\mu_p\) | overall mean detection |
| \(\sigma_p\) | between-web variation in detection |
| \(\mu_{\alpha}\) | overall mean interaction rate |
| \(\sigma_{\alpha}\) | between-web variation in interaction rate |
| \(*^{(z)}\) | standardized, non-centered latent deviations |
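To make the generative story concrete, here is a minimal sketch that simulates a single web from the model above. The number of species `S` and every numeric value are illustrative assumptions, not estimates from the previous post, and the hyperparameters are fixed rather than drawn from their priors.

```r
# Minimal sketch: simulate one web from the detection model.
# All numeric values are illustrative assumptions.
set.seed(1)

S = 10            # number of species in the web (assumed)
mu_p = 0          # mean detection on the logit scale (assumed)
sigma_p = 0.5     # between-web sd in detection (assumed)
mu_alpha = -1     # mean baseline interaction rate, logit scale (assumed)
sigma_alpha = 0.5
sigma_u = 1
sigma_v = 1

# Non-centered latent deviations for a single web k
p_z = rnorm(1)
alpha_z = rnorm(1)
u_z = rnorm(S)
v_z = rnorm(S)

p_k = plogis(mu_p + sigma_p * p_z)           # detection probability
alpha_k = mu_alpha + sigma_alpha * alpha_z   # baseline interaction rate
u = sigma_u * u_z                            # consumer generality
v = sigma_v * v_z                            # resource vulnerability

# True link probability pi[i, j]: rows are consumers, columns are resources
pi_mat = plogis(alpha_k + outer(u, v, "+"))

# Probability of observing each link, then the observed adjacency matrix
q_mat = p_k * pi_mat
A = matrix(rbinom(S * S, 1, q_mat), S, S)
```

Because \(p_k < 1\), every entry of `q_mat` is smaller than the matching entry of `pi_mat`, which is how the model represents false negatives.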

Though this is an important advance, our detection probability is just another probability that we estimate. It is treated as a latent, random effect that accounts for false negatives. However, it might be reasonable to assume that detection is non-random: species with particular traits may be more readily observed. For example, if larger species are easier to sample, then we would detect their diets more often. To incorporate this, we make an adjustment to Equation 3 (the expression for \(p_k\)). We have a few options here. First, let \(M_i\) and \(M_j\) be the log body masses of consumer \(i\) and resource \(j\).

Consumer Size Model

We define a parameter \(\gamma_k\) which controls the effect of consumer body mass on detection.

\[ \begin{align} \text{logit}(p_{i,k}) = \mu_p + \gamma_k M_i + \sigma_p p^{(z)}_{i,k} \end{align} \]

When \(\gamma_k > 0\), the feeding links of larger species are detected more often. This model creates a serious confound: if consumer body mass also affects the true link probability, its effects on links and on detection compete, and the two components of \(q_{ij,k}\) may not be separately identifiable.
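The root of the problem is that the likelihood only sees the product \(q = \pi \cdot p\). A small numeric sketch, with made-up values, shows two very different ecological stories that imply the same observation probability:

```r
# Two different (pi, p) pairs that imply the same observation probability q.
# Values are made up for illustration.
pi_a = 0.8;  p_a = 0.25   # common link, poorly detected
pi_b = 0.25; p_b = 0.8    # rare link, well detected

q_a = pi_a * p_a
q_b = pi_b * p_b

# TRUE: a Bernoulli likelihood on q alone cannot tell these apart
all.equal(q_a, q_b)
```

Separating the two requires extra structure, such as covariates or random effects that enter \(\pi\) and \(p\) differently.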

Size Ratio Model

A second approach that aligns with food web theory (Allesina 2011) might hypothesize that it is the ratio of \(M_i\) to \(M_j\) that influences detection. Since both covariates are on a log scale, we compute the difference between them and estimate the effect \(\gamma_k\).

\[ \begin{align} \text{logit}(p_{ij,k}) &= \mu_p + \gamma_k (M_i - M_j) + \sigma_p p^{(z)}_{ij,k} \end{align} \]

This does not remove the identifiability problem, but it partitions the variation differently: variation associated with the consumer-resource size ratio can, in principle, be separated from variation associated with consumer body size alone.

Size Mismatch Model

Similar to the size-ratio model, a third option would be to focus on the absolute difference between the species' body sizes. This means something quite different biologically.

\[ \begin{align} \text{logit}(p_{ij,k}) &= \mu_p + \gamma_k |M_i - M_j| + \sigma_p p^{(z)}_{ij,k} \end{align} \]

Each of the detection mechanisms makes very different assumptions about how detection scales with body size.

  1. Consumer Size Model: This individual-level mechanism suggests that detection probability scales with the body size of the consumer, regardless of the size of the prey. Large predators are easier to observe and thus their feeding links are easier to detect.
  2. Size Ratio Model: This edge-level mechanism suggests that the probability of observing a feeding link depends on the ratio of consumer to resource body size. Large predators feeding on smaller prey are more likely to be observed, whereas a predator that is much smaller than its prey is ecologically unusual and difficult to detect. The detection probability of such unusual events is proportionally lower.
  3. Size Mismatch Model: This edge-level mechanism assumes that detection probability depends on the absolute difference in body size between predator and prey. This mechanism does not imply that size mismatches are uncommon, only that such interactions leave evidence that is more difficult to detect.
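A quick way to see how the three covariates differ is to compute them for one pair of species in both feeding directions. The body masses below are hypothetical values chosen for illustration.

```r
# Hypothetical log10 body masses for two species (illustrative values)
M_large = 2    # 100 mass units
M_small = -1   # 0.1 mass units

# Covariate under each mechanism, for each direction of the feeding link
covariates = data.frame(
  pair = c("large eats small", "small eats large"),
  consumer_size = c(M_large, M_small),
  size_ratio = c(M_large - M_small, M_small - M_large),
  size_mismatch = abs(c(M_large - M_small, M_small - M_large))
)
covariates
```

The size ratio flips sign when the roles are reversed, while the size mismatch is identical in both directions. The consumer size covariate ignores the resource entirely.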

Simulation

To further investigate these mechanisms, we can simulate their effect on detection probability.

# Setup
set.seed(666)

inv_logit = function(x) exp(x) / (1 + exp(x))

# Parameters
N = 50
mu_p = -2
sigma_p = 0.25
nu_k = 1  # slope on the size covariate (gamma_k in the equations)

# Body mass
mass = exp(seq(log(0.01), log(1000), length.out = N))  # mass on raw scale
M = log10(mass)  # log10 body mass

# Pairwise matrices
Bi = matrix(M, N, N)  # consumer
Bj = t(Bi)            # resource

# Covariates
ratio = Bi - Bj
mismatch = abs(Bi - Bj)

# Noise
eps = matrix(rnorm(N*N, 0, sigma_p), N, N)

# (1) Consumer Size Model
pM = inv_logit(mu_p + nu_k * M + rnorm(N, 0, sigma_p))
pM_mat = matrix(pM, N, N)  # column-wise fill so rows index consumers, matching Bi
diag(pM_mat) = NA

# (2) Size Ratio Model
pR_mat = inv_logit(mu_p + nu_k * ratio + eps)
diag(pR_mat) = NA

# (3) Size Mismatch Model
pA_mat = inv_logit(mu_p + nu_k * mismatch + eps)
diag(pA_mat) = NA

# (4) Random Model - uniform random probabilities
p_rand_mat = matrix(runif(N * N, min = 0, max = 1), N, N)
diag(p_rand_mat) = NA


# Data for plotting
dat = data.frame(
  consumer_mass_log10 = as.vector(Bi),
  resource_mass_log10 = as.vector(Bj),
  size_ratio = as.vector(ratio),
  size_mismatch = as.vector(mismatch),
  p_consumer = as.vector(pM_mat),
  p_ratio = as.vector(pR_mat),
  p_mismatch = as.vector(pA_mat), 
  p_random = as.vector(p_rand_mat)
)

First, we can consider the matrices produced by these models. Each cell contains the probability of detecting an interaction between the two species.

# Packages for plotting
library(ggplot2)
library(patchwork)  # combines panels with + and /
library(latex2exp)

# Function to create a raster plot
plot_raster = function(df, zvar, title) {
  ggplot(df, aes(x = resource_mass_log10, y = consumer_mass_log10, fill = .data[[zvar]])) +
    geom_tile() +
    scale_fill_viridis_c(na.value = "white", name = zvar, option = 'A') +
    coord_fixed() +
    labs(
      x = TeX("Resource ${log}_{10}$ mass"),
      y = TeX("Consumer ${log}_{10}$ mass"),
      title = title
    ) +
    theme(legend.position = 'none')
}

# Plot each model
p1 = plot_raster(dat, "p_random", "Random Model")
p2 = plot_raster(dat, "p_consumer", "Consumer Size Model")
p3 = plot_raster(dat, "p_ratio", "Size Ratio Model")
p4 = plot_raster(dat, "p_mismatch", "Size Mismatch Model")


(p1 + p2) / (p3 + p4)

Consumer Size vs. Size-Ratio

To probe the implications of these models and their potential confounds, we will focus on the consumer size and size-ratio models. We state the full model as follows:

\[ \begin{align} A_{ij,k} &\sim \text{Bernoulli}(q_{ij,k}) & &\text{Likelihood} \\ q_{ij,k} &= \pi_{ij,k} \cdot p_{ij,k} & &\text{Adjusted probability} \\ \text{logit}(\pi_{ij,k}) &= \alpha_k + u_{i,k} + v_{j,k} + \beta_k M_i & &\text{True link probability} \\ \text{logit}(p_{ij,k}) &= \mu_p + \gamma_k (M_i - M_j) & &\text{Detection probability} \end{align} \]

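Let’s imagine a \(3 \times 3\) matrix of simulation scenarios. Rows vary the effect of size ratio on detection (\(\gamma_k\)); columns vary the effect of consumer body mass on the true link probability (\(\beta_k\)).

| Detection (\(\gamma_k\)) \ Link (\(\beta_k\)) | Null | Weak | Strong |
|---|---|---|---|
| Null | \(\gamma_k = 0\), \(\beta_k = 0\) | \(\gamma_k = 0\), \(\beta_k = 1\) | \(\gamma_k = 0\), \(\beta_k = 3\) |
| Weak | \(\gamma_k = 1\), \(\beta_k = 0\) | \(\gamma_k = 1\), \(\beta_k = 1\) | \(\gamma_k = 1\), \(\beta_k = 3\) |
| Strong | \(\gamma_k = 3\), \(\beta_k = 0\) | \(\gamma_k = 3\), \(\beta_k = 1\) | \(\gamma_k = 3\), \(\beta_k = 3\) |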
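One way to set up these scenarios is sketched below. It is a simplified illustration rather than the full model: the random effects \(u_{i,k}\) and \(v_{j,k}\) are dropped, and the species count, mass range, and intercepts are assumptions. Only the \(\gamma_k\) and \(\beta_k\) levels come from the table.

```r
# Sketch of the 3 x 3 scenario grid. Everything except the gamma and beta
# levels is an illustrative assumption.
set.seed(1)

scenarios = expand.grid(gamma = c(0, 1, 3), beta = c(0, 1, 3))

S = 20                            # species per web (assumed)
M = seq(-2, 3, length.out = S)    # log10 body mass (assumed range)
M = M - mean(M)                   # centered so the intercepts are baselines
alpha = -1                        # baseline link rate, logit scale (assumed)
mu_p = 0                          # baseline detection, logit scale (assumed)
ratio = outer(M, M, "-")          # M_i - M_j; rows are consumers

simulate_web = function(gamma, beta) {
  pi_mat = plogis(alpha + beta * matrix(M, S, S))  # true link probability
  p_mat = plogis(mu_p + gamma * ratio)             # detection probability
  q_mat = pi_mat * p_mat                           # probability of observing a link
  A = matrix(rbinom(S * S, 1, q_mat), S, S)        # observed adjacency matrix
  diag(A) = NA                                     # exclude self-links
  mean(A, na.rm = TRUE)                            # observed connectance
}

scenarios$connectance = mapply(simulate_web, scenarios$gamma, scenarios$beta)
scenarios
```

Observed connectance is only one possible summary. The same loop could return the full matrix `A` for each scenario so that a fitted model can be checked against the known \(\gamma_k\) and \(\beta_k\).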

References

Allesina, Stefano. 2011. “Predicting Trophic Relations in Ecological Networks: A Test of the Allometric Diet Breadth Model.” Journal of Theoretical Biology 279 (1): 161–68.