Bayesian state-space models lead to biased parameter estimates when applied to chaotic population dynamics (or so it seems)

… but does “model-free” forecasting really outperform the “true” model?

I’ve been busy lately, which is both the reason it has been quiet on this blog and why I didn’t manage to write earlier about a note that we submitted three weeks ago to the arXiv.

The note responds to a recent PNAS paper by Perretti and colleagues. The paper sparked our interest because it seemed to show that Bayesian state-space estimation of standard population models fails when population dynamics are chaotic. As a consequence, Perretti et al. found that simple “model-free” time-series methods outperformed forecasts made with structurally correct, fitted models, which led to their title “Model-free forecasting outperforms the correct mechanistic model for simulated and experimental data”.

We confirmed these findings. However, as we show in our reply, a small modification of the estimation method is sufficient to remove the problem, which calls into question whether it is really necessary to promote “model-free” forecasting as superior to “process-based” population models. Our response is available on the arXiv, where you can find a more detailed (and polished) account of the story (reading and comments are encouraged if you are interested in this topic). Still, I thought it might be nice to sketch briefly what’s going on, because I think this example illustrates a quite general, but not yet widely appreciated, insight into the mechanics of Bayesian and MLE state-space models.

The ecological model we concentrate on here is the time-discrete logistic model (map), to which a multiplicative process error is added

$N_{t+1} \sim \gamma\left(\mu = \log\left(r \cdot N_t \cdot (1-N_t/K)\right), \sigma = \sigma^{proc}\right)$

Here, $N_t$ is the population size at time $t$, $r$ is the intrinsic growth rate, $K$ is the carrying capacity, and $\gamma$ is a lognormal random variable with the standard parametrization of $\mu$ being the mean on the log-scale. I write $\sim$ to denote the stochastic process on the right side of the equation.

Now, let’s assume that the true population size $N$ is observed with error, leading to an observed population size $N_t^{obs}$ at time $t$ of

$N_t^{obs} \sim \gamma(\mu = log(N_t),\sigma = \sigma^{obs})$
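For readers who want to play with this, the two equations above can be simulated in a few lines of Python. This is a minimal sketch with illustrative parameter values of my own choosing, not the exact values used in the note:

```python
import numpy as np

def simulate_logistic(r=3.7, K=1.0, n0=0.5, steps=100,
                      sigma_proc=0.05, sigma_obs=0.2, seed=1):
    """Noisy logistic map with lognormal process and observation error."""
    rng = np.random.default_rng(seed)
    N = np.empty(steps)
    N[0] = n0
    for t in range(steps - 1):
        # deterministic logistic map, truncated at (almost) zero
        det = max(1e-10, r * N[t] * (1 - N[t] / K))
        # process error: lognormal with mean log(det) on the log scale
        N[t + 1] = rng.lognormal(mean=np.log(det), sigma=sigma_proc)
    # observation error: lognormal around the true population sizes
    obs = rng.lognormal(mean=np.log(N), sigma=sigma_obs)
    return N, obs

N_true, N_obs = simulate_logistic()
```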

The following figure shows results from a simulation of the true population time series (solid line) and the observations (circles)

The task is to recover the parameters used to create these data by fitting the model structure described above to the observed data points. We used the first 50 time steps for estimation and the next 50 time steps for the evaluation of the forecast. Because we have observation AND process error, the standard way to estimate this model would be an MLE or Bayesian state-space model, in which we infer both the true underlying population trajectory and the parameters that connect the time steps of this trajectory. The likelihood in JAGS (btw., the full code to reproduce our results is on the arXiv) would look something like


observed[1] ~ dlnorm(log(pop[1]), ObservationPrecision)

for(Time in 1:(nObs-1))
{
  # deterministic prediction of the logistic map, truncated at zero
  logpop[Time+1] <- log(max(0, r * pop[Time] * (1 - pop[Time] / K)))

  # process error connects the true population sizes between time steps
  pop[Time+1] ~ dlnorm(logpop[Time+1], ProcessPrecision)

  # observation error connects true and observed population sizes
  observed[Time+1] ~ dlnorm(log(pop[Time+1]), ObservationPrecision)
}
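To make explicit what JAGS is sampling from, here is a hedged Python translation of the same joint density: the complete-data log-likelihood of a candidate latent trajectory `pop` together with the parameters. The function names are mine, and the (flat) priors are omitted:

```python
import numpy as np

def lognorm_logpdf(x, mu, sigma):
    """Log-density of a lognormal with mean mu on the log scale."""
    return (-np.log(x * sigma * np.sqrt(2 * np.pi))
            - (np.log(x) - mu) ** 2 / (2 * sigma ** 2))

def complete_data_loglik(pop, obs, r, K, sigma_proc, sigma_obs):
    """Joint log-density of latent trajectory and observations,
    mirroring the JAGS likelihood above (priors omitted)."""
    # deterministic map applied to each latent state, truncated at zero
    det = np.maximum(1e-10, r * pop[:-1] * (1 - pop[:-1] / K))
    ll = lognorm_logpdf(pop[1:], np.log(det), sigma_proc).sum()  # process
    ll += lognorm_logpdf(obs, np.log(pop), sigma_obs).sum()      # observation
    return ll
```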


As Perretti et al. show, this leads to biased estimates. We discuss in our note that this is not completely surprising and has been observed before. The reason is that, with deterministic chaos, population trajectories created from the same model and the same parameters are still hardly better than random after a few time steps (see below: 20 simulations in gray compared to a reference, all from the same parameters)
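This divergence is easy to reproduce: two runs with identical parameters and an identical initial value, differing only in the realized process noise, separate to the size of the attractor within a few steps. A minimal Python sketch of my own (not the code from the note):

```python
import numpy as np

def run(r=3.7, K=1.0, n0=0.5, steps=50, sigma_proc=0.02, seed=0):
    """One realization of the noisy logistic map."""
    rng = np.random.default_rng(seed)
    N = [n0]
    for _ in range(steps - 1):
        det = max(1e-10, r * N[-1] * (1 - N[-1] / K))
        N.append(rng.lognormal(np.log(det), sigma_proc))
    return np.array(N)

# same parameters and start value, different process-noise realizations
a, b = run(seed=1), run(seed=2)
# the trajectories start identically but quickly become uncorrelated
print(np.abs(a - b).max())
```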

As a result of this “randomness”, the “true”, chaotic parameters don’t give a particularly good fit for all but the first few data points. What’s even more important, however, is that the fit (i.e. the likelihood) of far-away data points can be worse for the true parameters than for “wrong” parameters that create stable population dynamics, because the expected deviation between predicted and observed values is larger when comparing two chaotic runs with identical parameters than when comparing a chaotic run with a stable run (see Judd, 2007). As a result, assuming flat priors, the inference “decides” that it is more likely that the randomness in the time series originates from observation error than from chaotic population dynamics.

The problem arises because chaotic population trajectories tend to diverge in the long run. Therefore, one can apply a surprisingly simple fix: split the time series into smaller chunks (we use 10 chunks of 5 observations each) and fit those individually (see Pisarenko & Sornette, 2004). The combined likelihood for these chunks looks very similar

model{
  # Likelihood, split into independent chunks
  for(i in 1:steps)
  {
    # each chunk gets its own prior for its initial population size
    pop[1,i] ~ dunif(0.2, 1.5)
    observed[1,i] ~ dlnorm(log(pop[1,i]), ObservationPrecision)

    for(j in 1:(substeps-1))
    {
      # multiplicative process error
      procerr[j,i] ~ dlnorm(0, ProcessPrecision)
      pop[j+1,i] <- max(0, r * pop[j,i] * (1 - pop[j,i] / K)) * procerr[j,i]
      observed[j+1,i] ~ dlnorm(log(pop[j+1,i]), ObservationPrecision)
    }
  }
}


but the fit is greatly improved. Below is the comparison of true vs. estimated parameters for different values of the intrinsic growth rate r.
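Mechanically, the chunking itself is just a reshape of the observed series; here is a short Python sketch of one way to do it (variable names and the placeholder data are mine):

```python
import numpy as np

n_chunks, chunk_len = 10, 5    # as in the note: 10 chunks of 5 observations
obs = np.arange(50) + 1.0      # placeholder for the 50 observed values
chunks = obs.reshape(n_chunks, chunk_len)
# each row is now fitted independently, with its own prior on the
# initial population size (pop[1,i] in the JAGS model above)
```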

I think this is an interesting example that is worth thinking about some more. Our solution seems a quick and dirty, albeit effective, fix – I guess it’s unbiased if the priors on pop[1,i] are unbiased, but there might be a certain loss of information (and therefore a larger expected variance of the estimator) from dropping some of the known conditional probabilities that connect the data points. I’m not sure whether ABC methods à la Wood (2010) would get more information out of the data; a comparison of bias/variance would be interesting. An important general conclusion, however, is that unmodified state-space modeling approaches seem ill-suited to detect chaotic population dynamics if the observation error is unknown, because there is a bias towards misinterpreting chaos as observation error, resulting in parameter estimates that wrongly predict non-chaotic, stable dynamics, particularly for long time series.

Another possible route is of course to have dedicated data/experiments to fix the observation error, so that the model can’t even run into the trouble of replacing chaos with random observation error.

Finally, on a more general note, it is interesting to look at this example in the light of the discussion about whether the reductionist approach does indeed lead to better predictions (I still tend to think yes in the long term, if you count in everything that we learn on the way, but as a counterpoint, see Brian’s thoughts over at Dynamic Ecology here). Clearly, this is a dispute that can’t be settled by a particular case study, but this example shows that answers may vary drastically depending on the inference technique. Generally, a fair comparison is hardly possible at the current point in time, because inference methods for correlative models are quite well tested and make very effective use of existing data, while inference for process-based (reductionist) models with large and/or heterogeneous data is still in its infancy and probably requires a lot more methodological development and testing. It will be interesting to see which approach will thrive in our “data-rich” era, but I wouldn’t bury reductionism quite yet.

References

Perretti, C. T.; Munch, S. B. & Sugihara, G. (2013) Model-free forecasting outperforms the correct mechanistic model for simulated and experimental data. PNAS

Hartig, F. & Dormann, C. F. (2013) Does “model-free” forecasting really outperform the “true” model? A reply to Perretti et al. arXiv preprint arXiv:1305.3544.

Wood, Simon N. (2010) Statistical inference for noisy nonlinear ecological dynamic systems. Nature, 466, 1102-1104.

Judd, Kevin (2007) Failure of maximum likelihood methods for chaotic dynamical systems. Phys Rev E, 75, 036210.

Pisarenko, V. F. & Sornette, D. (2004) Statistical methods of parameter estimation for deterministically chaotic time series. Phys Rev E, 69, 036122.

4 thoughts on “Bayesian state-space models lead to biased parameter estimates when applied to chaotic population dynamics (or so it seems)”

1. In getting improved fit by breaking the data into smaller chunks, aren’t you basically just changing your objective function to one that’s less ambitious? You’re basically just trying to do one step ahead (or few steps ahead) prediction, rather than many steps ahead.


• In a way yes, but to put this more positively: I’m changing my objective function in a way that doesn’t assume that the model makes good long-term predictions for a system with deterministic chaos.
